Re: How to write Armenian ligatures?

From: Asmus Freytag (
Date: Sat Nov 24 2007 - 21:43:08 CST

    On 11/24/2007 2:31 PM, John H. Jenkins wrote:
    > I can't speak for other platforms, but on Mac OS X, the normal
    > behavior for Latin text is that plain text uses whatever ligatures the
    > font has on by default.
    > I, at least, would argue that ligation controls don't (as a rule)
    > belong in plain text, since ligation is intimately bound with the
    > typographic needs of a specific font.
    That sounds like the unreconstructed and simplistic view of the
    character glyph model with which Unicode started out, and which caused
    (and causes) all kinds of trouble.

    To summarize some of the key issues that should by now be recognized as
    requirements for the full character glyph model:

    1) Some scripts have required ligatures
    2) Some font styles have required ligatures ('ch' in Fraktur, but not in
    most other Latin styles)
    3) Some languages have *prohibited* ligatures
    4) Some scripts require additional levels of choice (e.g. up to four in
    5) Some users have special needs for additional control over ligatures
    (extreme design)

    The full character glyph model needs to allow for all of these - while
    minimizing impact on what gets encoded.

    In that sense, the OpenType technology has gotten a lot of things right
    by providing the three levels of ligatures that John Hudson summarized
    here (required, standard, and discretionary).

    However, features 3 and 4 will not work without giving the user a way to
    (locally) override the global settings. As the requirement for
    prohibition of ligatures often comes from orthography (or if you want,
    from the intersection of orthography and typography) it is appropriate
    to use coded characters. (In the kinds of cases that were discussed on
    this list at length, allowing a ligature changes the possible readings
    of the word in question - that's no longer typography pure, that's

    Luckily, it seems that ZWNJ is the only character that's required
    for language-specific requirements - at least I don't know an example to
    the contrary. The problem is when the user should supply it. If
    ligatures are disabled by default, then the ZWNJ is not needed (the
    option of not using any ligatures is permissible in many type styles
    (though not Fraktur)).

    In some sense then, if non-required, but non-fancy, ligatures were
    always enabled, users would (need to) supply the ZWNJ by default, and
    text would be correctly coded. But leaving ligatures enabled by default
    makes the use of plain text controls (in this case, ZWNJ) required.
    Otherwise, you get what are essentially misspelled words (or
    mis-typeset, if you want) in these languages.
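
    As a concrete illustration of what "supplying the ZWNJ" means at the
    character code level, here is a minimal Python sketch. The helper
    names and the example word are my own assumptions, not anything
    defined by Unicode or by a layout engine: German "Auflage" (Auf +
    lage) is used as a commonly cited case where an 'fl' ligature across
    the morpheme boundary is unwanted.

```python
# U+200C ZERO WIDTH NON-JOINER, the coded character discussed above.
ZWNJ = "\u200c"


def block_ligature(word: str, boundary: int) -> str:
    """Insert ZWNJ before index `boundary` (hypothetical helper).

    The letters on either side of the boundary are then separated by a
    coded character that asks the renderer not to ligate them.
    """
    if not 0 < boundary < len(word):
        raise ValueError("boundary must fall strictly inside the word")
    return word[:boundary] + ZWNJ + word[boundary:]


def strip_zwnj(text: str) -> str:
    """Remove ZWNJ, e.g. to compare against text typed without it.

    Only an illustration: in scripts where ZWNJ affects cursive joining,
    stripping it changes the text.
    """
    return text.replace(ZWNJ, "")


# Assumed example: morpheme boundary after "Auf" in "Auflage".
coded = block_ligature("Auflage", 3)
```

    The sketch also shows why existing text is a problem: the word typed
    without ZWNJ and the word typed with it are different code point
    sequences, and nothing in the string itself says where the boundary
    should have been.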

    You can't have one without the other, that's why I called your statement
    a "simplistic" view of the character glyph model.

    I think it's worth repeating that there is another dimension where it is
    in fact up to the font to make the decision on ligatures.

    A Fraktur font would need to supply all the ligatures that are required
    in Fraktur (e.g. also the 'ch' ligature) and mark them as "required",
    because in that typographical style they are required, even though in
    other Latin styles they may not be. A monospaced font would not normally
    support any ligatures as either required or default (as John Hudson
    pointed out a couple of posts ago), because doing so would violate the
    general expectation that users of monospaced fonts have of getting one
    display position for each 'character'.

    For Indic scripts, the model has come together over the last decade, and
    it looks like all the distinctions can be represented on the character
    code level.

    Finally for designers, you'll always need additional controls, which
    specialized applications and fonts will supply.

    So, it looks like the character glyph model is finally getting broad
    enough to meet the real requirements, but only if you allow both turning
    on some ligatures by default, as well as the use of at least ZWNJ.

    There are two unresolved concerns, and not minor ones.

    One, if ligatures are suddenly enabled, what will happen to all the
    existing texts that were written without ZWNJ inserted? Because of the
    nature of this issue it is *not* possible to supply these via the
    layout system. That's a real problem for languages that have such

    Two, many non-monospaced fonts are used in environments (such as text
    input widgets) where a more 1:1 relation between what is typed and what
    is displayed is appropriate.

    Both of these essentially require that at least some ligatures can be
    globally disabled for certain documents or certain uses. Because some
    ligatures are required (and because this requirement varies by font
    style, not merely script) the old "ligatures on, unless globally
    disabled, or overridden in a script-specific instance by character code
    (e.g. as in lam-alif)" model was too simplistic.
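
    The decision rule this implies can be written down in a few lines.
    What follows is only a sketch of the model as I have described it,
    not of any actual layout engine: the names `Level` and
    `ligature_forms` are made up for the illustration, and treating ZWNJ
    as overriding even a required ligature is an assumption.

```python
from enum import Enum

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


class Level(Enum):
    """The three ligature levels mentioned earlier (names are illustrative)."""
    REQUIRED = "required"
    STANDARD = "standard"
    DISCRETIONARY = "discretionary"


def ligature_forms(level: Level,
                   between: str = "",
                   standard_enabled: bool = True,
                   discretionary_enabled: bool = False) -> bool:
    """Decide whether a ligature of the given level is formed.

    `between` is whatever was coded between the two letters: either the
    empty string or a ZWNJ. The two flags stand for the global
    (per-document or per-use) settings.
    """
    # A coded ZWNJ is a local override and wins over every global setting
    # (assumption: this holds for required ligatures as well).
    if ZWNJ in between:
        return False
    # Required ligatures are not affected by the global switches.
    if level is Level.REQUIRED:
        return True
    if level is Level.STANDARD:
        return standard_enabled
    return discretionary_enabled
```

    With this rule, a text input widget can set standard_enabled to False
    and still get the required ligatures of the font style, while a
    document in a language with prohibited ligatures keeps them off
    locally through the coded ZWNJ.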

    Technology, if widely implemented, feeds back onto usage. It would be
    interesting to see whether widespread adoption of a simplistic
    "ligatures on by default" model would result in languages that now have
    prohibited ligatures giving up on this concept. Already, the inability of
    spell-checkers to handle unlimited compound nouns has had a noticeable
    impact on German spelling (books have been written on that subject...and
    "Word" is the villain).

