Re: Codepoint Differentiation

From: Mark E. Shoulson
Date: Wed Feb 23 2005 - 07:32:14 CST

    Doug Ewell wrote:

    >>And there are absolutely no problems with a Korean character showing
    >>up in the middle of their Web page -- as may currently occur with the
    >You have exactly the same issues with font dependency using this
    >approach as you would with the PUA, except that your solution requires
    >"smart fonts" and the PUA solution doesn't.
    Well, actually the PUA *does* require some smarts, if you're planning
    on writing in an RTL language. You need specialized software that
    treats it as RTL. Actually, no, an RLO character should do, I guess.

    Would it be such a crime to set the directionality of the PUA to
    "neutral"? That way the PUA would be... well... neutral. Like it
    should be.
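
    For illustration only, here is a rough Python sketch of the RLO idea.
    The Unicode Character Database lists the PUA code points as bidi class
    L (strong left-to-right), so right-to-left PUA text needs an explicit
    override. The three code points below are hypothetical private
    assignments, and force_rtl is a made-up helper name, not anything
    from an existing library.

```python
import unicodedata

RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

def force_rtl(text: str) -> str:
    """Wrap text in RLO ... PDF so a bidi-aware renderer lays it out RTL."""
    return f"{RLO}{text}{PDF}"

# Hypothetical PUA assignments standing in for letters of an RTL script.
pua_sample = "\uE000\uE001\uE002"

# Bidi class of each PUA code point as recorded in the character database.
bidi_classes = [unicodedata.bidirectional(ch) for ch in pua_sample]

wrapped = force_rtl(pua_sample)

print(bidi_classes)
print([f"U+{ord(ch):04X}" for ch in wrapped])
```

    Whether the result actually displays right-to-left still depends on
    the renderer honoring the override; this only shows where the control
    characters would go.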

    >>So we now see how a small block of codepoints, with almost zero impact
    >>on processing, can vastly increase the usefulness of Unicode to real-
    >>world people.
    >1. Interspersing a variation selector after EVERY letter does not
    >constitute "almost zero impact."
    >2. Variation selectors are for making minor glyphic distinctions within
    >a character, not for turning Latin into Klingon and vice versa.
    >3. This mechanism does not "vastly increase the usefulness of Unicode"
    >to anyone. Mark Shoulson already explained that Klingon-alphabet users
    >get along just fine with a PUA-based solution.
    "Get along just fine" is a little much; kind of the way every other
    non-Latin charset "gets along just fine" with the ISO-8859-* system. If
    people are using it, it should be encoded officially.

    >4. Adopting the style of a professor lecturing his students does not
    >change any of points 1 through 3.
    This, and Doug Ewell's other points, are all good ones. Do not mistake
    this letter for disagreement with him.
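
    To put a rough number on point 1, here is a small Python sketch that
    appends a variation selector to every letter of a word and compares
    sizes. U+FE00 (VARIATION SELECTOR-1) is used purely as a stand-in for
    the proposed "differentiator"; with_selectors is a hypothetical
    helper, and no such variation sequences are actually defined for
    these letters.

```python
VS1 = "\uFE00"  # VARIATION SELECTOR-1, a stand-in for the proposed differentiator

def with_selectors(text: str, selector: str = VS1) -> str:
    """Append the selector after every character of text."""
    return "".join(ch + selector for ch in text)

plain = "tlhIngan"
marked = with_selectors(plain)

# Code point counts: the marked string is twice as long.
print(len(plain), len(marked))  # prints "8 16"

# UTF-8 sizes: each selector costs 3 bytes next to a 1-byte ASCII letter.
print(len(plain.encode("utf-8")), len(marked.encode("utf-8")))  # prints "8 32"
```

    Every search, comparison, or cursor movement over such text would
    also have to skip or account for the selectors, which is hard to
    call "almost zero impact."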

    >>What we have done is turn Unicode from a "one dimensional array" into
    >>a "two dimensional array". The primary (and defaultable) glyphs and
    >>meanings get real codepoints along the main axis, and secondary (and
    >>allowably ignorable) glyphs and/or meanings get "differentiators"
    >>along the secondary axis.
    Indeed. The concept you have of Unicode differs fundamentally from the
    concept that the Unicode guiding bodies have for it. It isn't that they
    don't understand you, it isn't that they don't see what you're
    proposing, but what you're proposing does not, to them, qualify as what
    they set out to do.

