Re: Last Resort Glyphs (was: About the European MES-2 subset)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Jul 20 2003 - 09:37:19 EDT

    On Sunday, July 20, 2003 3:20 PM, Peter_Constable@sil.org <Peter_Constable@sil.org> wrote:

    > Philippe Verdy wrote on 07/19/2003 01:24:48 PM:
    > > Isn't this page creating the idea for a specific block of
    > > script-representative glyphs, that could be mapped in plane 14
    > > as special supplementary characters ?
    >
    > What would be the purpose of encoding these? I can't think of any.
    > They certainly don't need to be encoded as distinct characters to use
    > in a Last Resort font.

    Mostly for documentation purposes, but also for systems that want to be more informative to users who are missing a font for a particular script. Michael also judged it useful enough to create such a font for Apple, and Apple thought it would be useful for its Mac users. From usefulness comes use, and thus some legitimacy for encoding these in text, as special symbols that should be rendered with these symbols rather than with a normal glyph. It's also a fact that these symbols are used (as bitmaps) in the online Unicode charts (not charmaps, sorry for the wrong term), and probably, with Michael's custom font, in the published Unicode book.

    It's true that one can produce documentation without actually using a font with assigned codepoints for them. (A collection of SVG graphics could work for publishing purposes.)

    But editing the cmap of a TrueType font to include all possible codepoints would require mapping all 17 planes in the cmap, and unless the cmap is compressed, this would require 1,114,112 mappings, or more than 2 MB (at 2 bytes per mapping) for the cmap alone.
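
    The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch: it assumes an uncompressed cmap that stores one 16-bit glyph ID per codepoint, and the names below are illustrative, not part of any font format.

```python
# Assumption: a flat, uncompressed cmap with one 16-bit glyph ID per codepoint.
BYTES_PER_ENTRY = 2
PLANES = 17
CODEPOINTS_PER_PLANE = 0x10000  # 65,536 codepoints in each plane


def flat_cmap_bytes(entries: int, bytes_per_entry: int = BYTES_PER_ENTRY) -> int:
    """Size in bytes of a flat table holding `entries` glyph IDs."""
    return entries * bytes_per_entry


all_codepoints = PLANES * CODEPOINTS_PER_PLANE
full_cmap_size = flat_cmap_bytes(all_codepoints)

print(f"{all_codepoints:,} mappings")                      # 1,114,112 mappings
print(f"{full_cmap_size:,} bytes")                         # 2,228,224 bytes
print(f"{full_cmap_size / 2**20:.3f} MiB for the cmap")    # 2.125 MiB for the cmap
```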

    This is probably too much for a default font, even if the system uses paging to access this TrueType font. In fact, a font containing only one glyph per block (ordered by the allocation date of the corresponding block), plus an extra cmap-like table that uses ranges of codepoints instead of individual entries, would probably work better (of course, this would be an extension to the standard tables used by classic fonts). Without such a TTF extension, it would be simpler to map only the 65,536 UTF-16 code units (surrogates included), and thus use only 128 KB for a UTF-16 based cmap. I don't know the internals of the OpenType format; maybe a compressed format for internal tables that allows representing ranges already exists, or there is room, among the table IDs, for application-specific custom tables.
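
    To make the range idea concrete, here is a minimal Python sketch of a lookup over (first codepoint, last codepoint, glyph ID) records. Everything in it is an assumption made for illustration: the three sample blocks, the glyph IDs, and the 12-byte record size are hypothetical and do not describe an actual TrueType or OpenType table.

```python
from bisect import bisect_right

# Hypothetical range records: (first codepoint, last codepoint, glyph ID).
# One record covers a whole block, so every codepoint in it shares one glyph.
RANGES = [
    (0x0000, 0x007F, 1),  # Basic Latin
    (0x0370, 0x03FF, 2),  # Greek and Coptic
    (0x0400, 0x04FF, 3),  # Cyrillic
]
STARTS = [first for first, _, _ in RANGES]  # sorted, for binary search
NOTDEF = 0  # glyph returned when no range covers the codepoint


def glyph_for(codepoint: int) -> int:
    """Return the block glyph for a codepoint, or NOTDEF if unmapped."""
    i = bisect_right(STARTS, codepoint) - 1  # last range starting at or before it
    if i >= 0:
        _, last, glyph_id = RANGES[i]
        if codepoint <= last:
            return glyph_id
    return NOTDEF


# Size comparison, assuming three 32-bit fields (12 bytes) per range record
# and one 16-bit glyph ID per entry in a flat 16-bit cmap.
range_table_bytes = len(RANGES) * 12
flat_16bit_cmap_bytes = 0x10000 * 2

print(glyph_for(0x0041), glyph_for(0x03B1), glyph_for(0x0100))
print(range_table_bytes, flat_16bit_cmap_bytes)
```

    With one record per block, the table grows with the number of blocks rather than with the number of codepoints, which is why it stays far below the 128 KB of a flat 16-bit cmap.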


