Re: Why people still want to encode precomposed letters

From: John Hudson
Date: Tue Nov 25 2008 - 12:52:00 CST

    philip chastney wrote:

    > I meant that there are relatively few applications which make use of
    > OpenType tables

    Examples? In my experience, there are relatively few applications today
    that do not make use of OTL tables, either directly or via system or
    third-party layout engines. The level of support for particular layout
    features may vary, but the gaps generally involve UI limitations, and
    features that are intended to be on by default are the most likely to be
    supported. GPOS mark-to-base and mark-to-mark positioning is supported
    for European scripts in any app using recent versions of the Uniscribe
    engine or ICU, including web browsers. In fact, app support for this
    feature is more advanced and widespread than font support is.

    > but back to the here and now -- to recap: the only way to support the
    > open-ended requirements of the philologists and the mathematicians is
    > via OpenType tables

    [Aside: mathematics requires things that OTL alone cannot handle, hence
    Microsoft's 'math' font table and dedicated math layout engine.]

    > these same OpenType tables can be used to support known composites in
    > natural languages, and would make for a neater, smaller, font, with
    > better coverage of accented letters than the use of preformed
    > composites, and would (could, should) eliminate any visual differences
    > between, say, U+00E9 and U+0065, U+0301

    Let's clarify a point of terminology: in font production, a 'composite'
    is a glyph that uses one or more other glyphs as components, e.g. a
    precomposed glyph for displaying a diacritic base+mark(s) combination. I
    believe what you are talking about -- in terms of making neater, smaller
    fonts -- is dynamic mark positioning, not composites, which is also what
    I am advocating. But in my experience there is neither the need nor a
    significant benefit (relative to the cost) in trying to define a target
    set of combinations for dynamic mark positioning, because the same
    mechanism can be designed to address arbitrary combinations.
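
    As a rough illustration at the character level (not the font level), the
    Python sketch below uses the standard unicodedata module to show why a
    closed list of precomposed forms cannot cover arbitrary combinations:
    U+0065 U+0301 composes to U+00E9 under NFC, while Cyrillic U+0430 with
    the same mark has no precomposed code point and remains a sequence. The
    helper name precomposed_form is invented for this example.

```python
import unicodedata
from typing import Optional


def precomposed_form(base: str, mark: str) -> Optional[str]:
    """Return the single precomposed character for base+mark, if one exists.

    Hypothetical helper: it relies on NFC normalization, which composes a
    base+mark sequence only when Unicode defines a precomposed character.
    """
    composed = unicodedata.normalize("NFC", base + mark)
    return composed if len(composed) == 1 else None


COMBINING_ACUTE = "\u0301"

# Latin e + acute has a precomposed form, U+00E9.
latin = precomposed_form("e", COMBINING_ACUTE)

# Cyrillic a (U+0430) + acute has none, so the text stays a sequence
# and the font has to position the mark dynamically.
cyrillic = precomposed_form("\u0430", COMBINING_ACUTE)

print(repr(latin), repr(cyrillic))
```

    Both spellings of the Latin case are canonically equivalent, so a font
    that renders them differently is showing a font limitation rather than a
    difference in the text.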

    This discussion began with the observation that certain fonts appear to
    support generic mark positioning for arbitrary combinations of Latin
    letters plus marks, but not for Cyrillic letters. I happen to know some
    of these fonts inside-out, and know that, in fact, they do support
    *some* Cyrillic plus mark combinations: they support a targeted subset
    that someone, somewhere identified as 'known combinations'. So the very
    impetus of this discussion is an example of the downside of relying on a
    subset of known combinations rather than generic positioning for
    arbitrary combinations: you end up not supporting unknown combinations
    which may, in fact, occur. The sad irony of such fonts is that exactly
    the same anchor position that is used to position e.g. an acute accent
    above a Cyrillic letter could be used to position any other above-centre
    mark, and generic positioning costs nothing more in terms of font
    development work than anchors for a subset of marks.
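
    The difference between the two approaches can be sketched in a few lines
    of Python. This is a simplified model, not how any layout engine or the
    GPOS table is actually implemented: the glyph names, coordinates, and
    function names are all invented, and advance widths are ignored. It
    shows only that one anchor on a base glyph serves every mark that shares
    the anchor class, whereas a list of known pairs rejects anything that is
    not on the list.

```python
from typing import Dict, Optional, Set, Tuple

Point = Tuple[int, int]

# Hypothetical anchor data in font units (made up for illustration).
# Each base glyph carries one anchor per mark class.
BASE_ANCHORS: Dict[str, Dict[str, Point]] = {
    "e": {"top": (250, 500)},
    "uni0430": {"top": (260, 500)},  # Cyrillic small a
}

# Each mark names its class and its own attachment point.
MARK_ANCHORS: Dict[str, Tuple[str, Point]] = {
    "acutecomb": ("top", (0, 480)),
    "gravecomb": ("top", (0, 480)),
    "macroncomb": ("top", (0, 470)),
}


def generic_offset(base: str, mark: str) -> Optional[Point]:
    """Shift that moves the mark's anchor onto the base's anchor of the same class."""
    mark_class, (mx, my) = MARK_ANCHORS[mark]
    anchor = BASE_ANCHORS.get(base, {}).get(mark_class)
    if anchor is None:
        return None  # base has no anchor for this class of mark
    bx, by = anchor
    return (bx - mx, by - my)


# A targeted subset: only pairs that someone identified in advance.
KNOWN_PAIRS: Set[Tuple[str, str]] = {("uni0430", "acutecomb")}


def targeted_offset(base: str, mark: str) -> Optional[Point]:
    """Same arithmetic, but restricted to the list of known pairs."""
    if (base, mark) not in KNOWN_PAIRS:
        return None
    return generic_offset(base, mark)
```

    With this data, generic_offset("uni0430", "macroncomb") yields a
    position from the same "top" anchor that serves the acute, while
    targeted_offset returns None for that pair even though the anchor it
    would need is already in the font.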

    > once such a file exists, it could be used to test existing fonts for
    > completeness, and it would provide a specification standard for new fonts

    > also, it would provide a suitable filing place for newly discovered
    > composites in minority languages, and the world would be a slightly
    > better place, as a result

    Such a list would be useful, as John Jenkins noted, in providing
    real-world test cases. But I am not convinced that it is worth the
    effort. In 1998, I compiled lists of base+mark combinations occurring in
    the orthographies of about 200 African languages. It not only took a
    long time, but the resulting data was unreliable because of instability
    in the orthographies and their regional implementations. Precise,
    targeted subsets of known combinations are difficult to compile and of
    limited use. Generic positioning of marks categorised by shared anchors
    is much easier to achieve and provides flexible results.

    John Hudson

    Tiro Typeworks
    Gulf Islands, BC
    You can't build a healthy democracy with people
    who believe in little green men from Venus.
                        -- Arthur C. Clarke
