RE: Phetsarat font, Lao unicode

From: James Kass (thunder-bird@earthlink.net)
Date: Mon Jul 09 2007 - 22:16:18 CDT


    Philippe Verdy wrote,

    > One problem is that fonts (at least with TrueType/OpenType) are not designed
    > to support reordering and positioning with an unbound number of base
    > characters.

    Reordering is handled by the font engine, not by the font.

    > For example the GSUB/GPOS tables in TrueType require listing
    > somewhere the complete list of codepoints where such reordering and
    > positioning may be applied, ...

    What the font stores is a listing of glyph IDs, not codepoints.
    The only place a font stores codepoints is the "cmap" table. The
    listing of glyph IDs may be a complete list of every glyph ID
    involved, or it may be done using ranges in order to minimize
    table size.
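    As a rough illustration of the two storage styles just described,
    the sketch below holds the same set of glyphs once as a full list
    and once as ranges. The glyph IDs and function names are made up
    for the example, and this is not the binary layout of any real
    font table.

```python
from bisect import bisect_right

# Hypothetical glyph IDs, chosen only for illustration.
glyph_list = [12, 13, 14, 15, 40, 41, 42]   # every glyph ID listed
glyph_ranges = [(12, 15), (40, 42)]         # same set as (first, last) ranges


def covered_by_list(gid, glyphs):
    """Return True if gid appears in a sorted list of glyph IDs."""
    i = bisect_right(glyphs, gid)
    return i > 0 and glyphs[i - 1] == gid


def covered_by_ranges(gid, ranges):
    """Return True if gid falls inside one of the sorted (first, last) ranges."""
    starts = [first for first, _ in ranges]
    i = bisect_right(starts, gid)
    return i > 0 and ranges[i - 1][0] <= gid <= ranges[i - 1][1]
```

    Both lookups answer the same question; the range form simply needs
    fewer entries when the glyph IDs are contiguous.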

    > ... something that can't be performed in fonts with
    > the current format, because they don't allow defining character classes in
    > them,

    The OpenType GDEF table format requires assignment of
    glyphs to various glyph classes (base, ligature, mark, and
    component). The set of classes is fixed, though; neither users
    nor developers can define new ones. Unicode also assigns
    character classes, but only to characters. Complex script
    fonts generally have scads of "presentation form" glyphs
    which aren't characters in the Unicode sense.
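    A minimal sketch of what such a class assignment looks like from
    the engine's side, assuming the four GDEF glyph class values
    (1 = base, 2 = ligature, 3 = mark, 4 = component). The glyph IDs
    and the table contents are hypothetical.

```python
from enum import IntEnum


class GlyphClass(IntEnum):
    """Fixed set of GDEF glyph classes; fonts cannot add new ones."""
    UNASSIGNED = 0
    BASE = 1
    LIGATURE = 2
    MARK = 3
    COMPONENT = 4


# Hypothetical glyph ID -> class assignments for an imaginary font.
GDEF_CLASSES = {
    2: GlyphClass.BASE,
    3: GlyphClass.BASE,
    4: GlyphClass.BASE,      # a presentation form with no codepoint of its own
    5: GlyphClass.MARK,
    6: GlyphClass.LIGATURE,
}


def glyph_class(gid):
    """Look up the class of a glyph ID; unlisted glyphs fall into class 0."""
    return GDEF_CLASSES.get(gid, GlyphClass.UNASSIGNED)


def is_mark(gid):
    """Return True if the glyph is classified as a combining mark."""
    return glyph_class(gid) is GlyphClass.MARK
```

    Note that the lookup is keyed by glyph ID, so presentation-form
    glyphs can be classified even though Unicode says nothing about
    them.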

    > ... and assigning them pseudo-glyph IDs that can be used in GSUB tables.

    Pseudo-glyph ID might be a misleading phrase. A Glyph ID is
    simply the number of the position of a glyph's data in a font.
    Glyph IDs count from zero, so the first glyph has the glyph ID
    of zero, the second has the glyph ID of one, and so forth.
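    To make the numbering concrete, here is a small sketch of an
    imaginary font's glyph order and "cmap". The glyph names and the
    font itself are invented; the codepoints are U+0020 SPACE,
    U+0E81 LAO LETTER KO, and U+0E82 LAO LETTER KHO SUNG.

```python
# Position in this list is the glyph ID; glyph 0 is the "missing glyph".
GLYPH_ORDER = [".notdef", "space", "lao_ko", "lao_khoSung", "lao_ko.alt"]

# The "cmap" is the only place codepoints appear: codepoint -> glyph ID.
CMAP = {0x0020: 1, 0x0E81: 2, 0x0E82: 3}


def glyph_id_for(codepoint):
    """Map a codepoint to a glyph ID; unmapped codepoints get glyph 0."""
    return CMAP.get(codepoint, 0)


def glyph_name(gid):
    """Return the (hypothetical) name of the glyph stored at position gid."""
    return GLYPH_ORDER[gid]
```

    Glyph 4 ("lao_ko.alt") has no "cmap" entry at all. It can only be
    reached by a substitution inside the font, which is why the other
    tables work in glyph IDs rather than codepoints.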

    > It seems then more reasonable that renderers implement these character
    > classes, and recognize which fonts support such reordering : ...

    And that's exactly what font engines are supposed to do.

    > ... the renderer
    > for example could be looking for rules based on the dotted circle symbol,
    > and automatically infer the other applicable rules for other Common symbols,

    Does this assume that the dotted circle is part of the encoded text?
    It normally isn't; it's inserted (in the display only) by at least one
    popular font engine. Regardless, other symbols will almost always
    have completely different metrics. It's unlikely that a font engine
    will calculate the different heights, advance widths, and so forth,
    in order to approximate a correct placement of the combining
    character glyph. It's probably equally unlikely that a font developer
    will add a potentially infinite number of GPOS rules to a font's tables
    in order to accomplish this with every conceivable arbitrary base
    character glyph.
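    A simplified sketch of anchor-based mark placement may show why
    this doesn't generalize. It assumes each base glyph and each mark
    glyph carries an attachment point in font units; all names and
    numbers are hypothetical, and a real engine would also account
    for the base glyph's advance width.

```python
# Hypothetical anchor points in font units, one entry per supported base.
BASE_ANCHORS = {"lao_ko": (310, 520), "lao_khoSung": (295, 540)}
MARK_ANCHORS = {"lao_vowel_i": (-240, 520)}


def mark_offset(base, mark):
    """Return the (dx, dy) that lines the mark's anchor up with the base's.

    Returns None when the font has no anchor data for the pair, which is
    the situation for any arbitrary base the font developer did not cover.
    """
    if base not in BASE_ANCHORS or mark not in MARK_ANCHORS:
        return None
    bx, by = BASE_ANCHORS[base]
    mx, my = MARK_ANCHORS[mark]
    return (bx - mx, by - my)
```

    Each base needs its own anchor because the metrics differ from
    glyph to glyph. For a base that isn't listed, the lookup has
    nothing to offer, and the engine would have to guess.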

    Then there's that performance hit...

    Best regards,

    James Kass


