RE: But E0000 Custom Language Tags Are Actually *Required* For Use By Unicode

From: Richard T. Gillam (
Date: Thu Mar 03 2005 - 10:02:01 CST

    >Unicode has as much responsibility to see that some means is provided
    >for the display of Serbian 't', and Old Athenian lambda, as it has to
    >make sure there is some means to display not just the basic Indic
    >syllables that get codepoints, but the needed ligatures as well.

    Unicode's job isn't to "make sure there is some means to display"
    things. It's to make sure there is some way to REPRESENT things. There
    is enough information in a Unicode code stream for a rendering engine to
    decide which Indic ligatures it needs to display, but there's no
    requirement in Unicode that it display those ligatures. There are
    mechanisms in Unicode that can be used to control whether a given
    ligature appears in a given spot, but fonts are not required to respond
    to these controls.
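
    As a rough illustration of those controls, here is a minimal Python
    sketch, assuming the Devanagari sequence KA + VIRAMA + SSA as the
    example conjunct. The variable names and the describe() helper are
    mine, made up for illustration. It only shows what goes into the code
    stream; whether anything changes on screen is still up to the font
    and the rendering engine.

```python
import unicodedata

# Code points for an example Devanagari conjunct (chosen for illustration).
KA = "\u0915"      # DEVANAGARI LETTER KA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
SSA = "\u0937"     # DEVANAGARI LETTER SSA

# The two joining controls.
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER: asks for no conjunct at this spot
ZWJ = "\u200D"     # ZERO WIDTH JOINER: asks for the half-form at this spot


def describe(text):
    """Return a 'U+XXXX NAME' string for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The plain sequence leaves the choice of ligature to the renderer.
plain = KA + VIRAMA + SSA
# A control after the virama expresses a preference for this one spot only.
no_ligature = KA + VIRAMA + ZWNJ + SSA
half_form = KA + VIRAMA + ZWJ + SSA

for label, text in [("plain", plain),
                    ("ZWNJ", no_ligature),
                    ("ZWJ", half_form)]:
    print(label, describe(text))
```

    Stripping the control back out gives exactly the plain sequence, which
    is the sense in which these are local hints layered on top of the same
    underlying text.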

    And even those controls are useful only in local contexts. You wouldn't
    use them to govern whether a particular ligature should or shouldn't
    appear in a given document everywhere it might be appropriate; you do
    that by picking an appropriate font or, if you're using a font that
    gives you both choices, an appropriate font feature. And fonts and font
    features are not Unicode's responsibility.

    Same thing with Serbian t. Unicode provides a way to represent it. And
    this isn't something you choose to present or not in a local context--
    either a piece of text is Serbian or it isn't, and all ts in that
    passage will be displayed the same way. You specify that by using a
    Serbian font or not, or by enabling a font feature in a bilingual
    Cyrillic font. Unicode doesn't get involved in this.

    If you want to get away from requiring specific fonts and just have an
    abstract way of saying what range of glyph variation is appropriate,
    that's still outside the scope of Unicode. That's where technologies
    like XML, CSS, and SVG come into play.
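
    To make the division of labor concrete, here is a minimal Python
    sketch using the standard library's xml.etree.ElementTree. The element
    name "p", the tagged() helper, and the single-letter sample text are
    assumptions for illustration only. The point is that the code point
    for the letter is the same in both elements; the language lives in the
    xml:lang attribute, where a styling layer can pick it up.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TE = "\u0442"  # CYRILLIC SMALL LETTER TE, the same code point in any language


def tagged(text, lang):
    """Wrap text in a hypothetical <p> element carrying an xml:lang tag."""
    el = ET.Element("p")
    el.set(XML_LANG, lang)
    el.text = text
    return el


serbian = tagged(TE, "sr")
russian = tagged(TE, "ru")

# The plain text is identical; only the markup around it differs.
print(ET.tostring(serbian, encoding="unicode"))
print(ET.tostring(russian, encoding="unicode"))
print(serbian.text == russian.text)
```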

    Unicode provides for a way to represent all the things you mention. It
    does not provide ways to make distinctions between them in plain text,
    because these are not plain-text distinctions.

    I'm not saying the distinctions don't need to be made, merely that
    Unicode isn't the proper mechanism to use. Not everything is plain
    text. If you want to get traction in your arguments, you have to
    demonstrate that the distinctions you want to make are *plain text*
    distinctions, not merely that they're necessary.

    --Rich Gillam
      Language Analysis Systems, Inc.

    This archive was generated by hypermail 2.1.5 : Thu Mar 03 2005 - 10:02:40 CST