Re: E0000 Language Tags for Some Obscure Languages

From: Asmus Freytag
Date: Sun Feb 27 2005 - 15:12:00 CST

    At 09:04 AM 2/27/2005, wrote:
    >What is the reaction to using E0000 language tags ...

    The UTC has settled the question that general use of language tags is
    strongly discouraged, as they are stateful and therefore don't belong in
    plain text. Proposals that don't start from that point are non-starters.

    >But the OT font should be able to recognize an E0000 codepoint string (as just
    >codeponts) and do "context" glyph swapping [*].

    This statement combines a thorough misunderstanding of OT and of language tags.

    Language tags act like delimiters that define a stateful behavior (i.e.
    on/off). A font may be called on for glyphs for characters on page 43 of a
    100 page document - with the entire text in the same language. In that case
    there would be no language tag characters in the run of text to act as
    'context'. Language information belongs in higher level protocols, wherever
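
    To illustrate the statefulness point, here is a minimal Python sketch. The
    code point values (U+E0001 LANGUAGE TAG, U+E007F CANCEL TAG, and the tag
    characters U+E0020..U+E007E that mirror ASCII) come from the Unicode
    Standard; the helper names make_language_tag and language_at are
    hypothetical, chosen only for this example. It shows that the language in
    effect can only be found by scanning from the start of the text, so a run
    taken from the middle carries no tag characters at all.

```python
LANGUAGE_TAG = 0xE0001   # U+E0001 LANGUAGE TAG
CANCEL_TAG = 0xE007F     # U+E007F CANCEL TAG
TAG_BASE = 0xE0000       # tag characters mirror ASCII at U+E0020..U+E007E


def make_language_tag(lang):
    """Hypothetical helper: encode e.g. 'fr' as a Plane 14 language tag."""
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_BASE + ord(c)) for c in lang)


def language_at(text, index):
    """Hypothetical helper: language in effect just before text[index].

    The answer depends on everything that precedes the position,
    which is what makes the mechanism stateful.
    """
    current = None
    i = 0
    while i < index:
        cp = ord(text[i])
        if cp == LANGUAGE_TAG:
            i += 1
            letters = []
            # Collect the tag characters that spell out the language code.
            while i < len(text) and 0xE0020 <= ord(text[i]) <= 0xE007E:
                letters.append(chr(ord(text[i]) - TAG_BASE))
                i += 1
            if letters:
                current = "".join(letters)
        elif cp == CANCEL_TAG:
            current = None   # tag state is switched off
            i += 1
        else:
            i += 1
    return current


tag = make_language_tag("fr")
document = tag + "bonjour le monde"

# Scanning the whole document finds the tag.
whole = language_at(document, len(document))

# A run taken from later in the text contains no tag characters,
# so whatever receives only this run sees no language information.
run = document[len(tag):]
partial = language_at(run, len(run))
```

    Here whole is "fr" while partial is None: the run itself has nothing in
    it that could serve as 'context'.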

    >This is old but hopefully reliable:

    This is dated 8-31-2000, and therefore indeed old. Contrary to your
    suggestion, it has long been superseded, as you can see if you look at where it says:

    UTR #7: Plane 14 Characters for Language Tags has been incorporated into
    the Unicode Standard Version 3.1, and is thus now superseded. The last
    version of that document before it was superseded can be found at

    Each Unicode Technical Report has a link to its 'most recent version' in
    the header. There's never an excuse to not check first what the most
    up-to-date version of the Standard says, and courtesy to others demands
    that you do that before trying to engage them in a discussion of a proposal
    for a particular use of the standard.

