RE: But E0000 Custom Language Tags Are Actually *Required* For Use By Unicode

From: Peter Constable (petercon@microsoft.com)
Date: Wed Mar 02 2005 - 17:04:30 CST


    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf
    > Of UList@dfa-mail.com

    > 4. The sole correct way to access this local variation of the Cyrillic script
    > is with a "language tag".

    There is not *one, sole* correct way. There's absolutely nothing wrong
    with someone creating a font that is specific to Serbian.

     
    > 5. The local versions of the Greek script are *identical* in nature to this
    > local version of the Cyrillic script. They are particular, local styles of the
    > same international script system, as used by different Ancient Greek states.
    > They are in fact referred to by the technical term "epichoric", meaning
    > literally, "local".

    The fact that there are typographic variations is common to both cases.
    It is *not* obvious that they are otherwise identical, however. I know
    that the Serbian italic variants are a conventional distinction from
    Russian and other languages that is needed across all Serbian users. In
    the Greek case, all I know is that there are a number of variations in
    palaeographic texts. I have no idea to what extent sets of conventions
    (iso-glyph categories) can be defined across all palaeographic Greek
    texts, or to what extent sets of conventions can be defined in terms of
    what modern users (the palaeographers of today) are looking for. It may
    seem obvious to you that these two situations are identical, but that
    won't be obvious to me until I have seen input from the Greek
    palaeographic community as a whole.

    > 7. The only correct way to access local variations of the Greek script is
    > with "language tags".

    Another invalid assumption (see #4).

     
    > 9. The only correct way to access local variations of scripts, including
    > variations of Cyrillic script, of Greek script, of Berber script, etc., is
    > with "language tags".

    Yet another invalid assumption.

    > 10. *But* I have previously demonstrated, fairly obviously, that it is hardly
    > practical for Microsoft to add long lists of OpenType "language tags" for
    > something as obscure as extinct local variations of Greek script. It is
    > certainly not practical for Microsoft to add lists of every possible local
    > variation of every obscure script such as Berber.
    >
    > 11. *Therefore*, some kind of "custom language tag" system is a
    > *requirement*, for Unicode to function as it is claimed it is *intended* to
    > function.

    An invalid conclusion based on invalid premises. You might be able to
    make a case that a custom language tagging system might be a useful
    thing to do (on which I will reserve judgment for now), but it is *not*
    the case that this is needed for Unicode to function as it is claimed it
    is intended to function.

    > 12. This is not an obscure, personal desire of mine. It is an essential and
    > inherent component of the approach Unicode itself has created (but perhaps
    > failed to think through to its conclusion).

    At the moment, I see no indication that it is anything more than a
    personal desire.

     
    > 13. Unicode has in fact created exactly this custom language tag system with
    > the E0000 block. [LANGUAGE][x][-][custom_language_name][END LANGUAGE]. But
    > then this system has been "strongly disrecommended" and therefore is not
    > likely to be implemented by font technologies.

    This is complete nonsense. Unicode encoded characters that can be used
    the way metadata would be, but in the absence of markup mechanisms. They
    have been called "language tag" characters because they were requested
    (and reluctantly granted) specifically for a protocol that wanted to add
    language identification to text, but that does not entail that they must
    be used for that purpose only -- Unicode doesn't specify anything about
    *where* they should be used or with what meaning. None of this implies
    that Unicode has created a language tagging system, however, much less a
    custom language tag system.
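
    For illustration only, here is a minimal Python sketch of what the tag
    characters in the E0000 block are at the encoding level: U+E0001
    LANGUAGE TAG followed by tag characters U+E0020..U+E007E, each of which
    mirrors a printable ASCII character at an offset of 0xE0000. The
    function names and the tag string "x-epichoric" are hypothetical,
    chosen for this example; nothing here implies that any font technology
    interprets such a sequence.

```python
# Sketch: mapping an ASCII tag string onto Unicode tag characters.
# Function names and the sample tag are assumptions made for illustration.

LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG (introduces the tag)
TAG_OFFSET = 0xE0000         # tag characters mirror ASCII at this offset


def to_tag_characters(ascii_tag: str) -> str:
    """Return LANGUAGE TAG followed by the tag-character form of ascii_tag."""
    out = [LANGUAGE_TAG]
    for ch in ascii_tag:
        cp = ord(ch)
        # Only printable ASCII (U+0020..U+007E) has a tag-character mirror.
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"not representable as a tag character: {ch!r}")
        out.append(chr(TAG_OFFSET + cp))
    return "".join(out)


def from_tag_characters(text: str) -> str:
    """Recover the ASCII string spelled out by tag characters in text."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


if __name__ == "__main__":
    encoded = to_tag_characters("x-epichoric")  # hypothetical private-use tag
    print([f"U+{ord(c):04X}" for c in encoded])
    print(from_tag_characters(encoded))
```

    The sketch shows only that the characters carry an ASCII string; what
    that string means, and where it may appear, is left to whatever
    protocol uses it.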

    > 14. THEREFORE, in order to make it actually possible to use Unicode's *own*
    > stated and vigorously defended philosophy on the sole correct means of
    > accessing local script variants -- for local script variants which are too
    > obscure to receive official language tags -- Unicode must do one of the
    > following:

    I don't think there's any need to comment on these conclusions when the
    argumentation that led to them is flawed.

    > - you may have noticed from this discussion that it seems "(local) script
    > tags" are more appropriate than "language tags" for all these matters,
    > including Serbian 't';

    You may have noticed that I clarified from the outset (when you first
    asked about this on the OpenType list) that OpenType "Language-System"
    tags are not the same as language tags.

    Peter Constable


