Re: Normalisation and directionality (was: how to add all latin (and greek) subscripts)

From: Kenneth Whistler (
Date: Mon Jul 07 2008 - 15:39:42 CDT


    > > U+2135 ALEF SYMBOL has
    > > directionality L and U+05D0 HEBREW LETTER ALEF has directionality R.
    > Allowing normalisation to resolve to a character with different
    > directionality seems to me risky. Isn't there a danger of the strong RTL
    > directionality of U+05D0 messing up layout if substituted for U+2135 in
    > some circumstances?

    Of course. Which is one of the reasons why the mathematical Hebrew
    symbols were separately encoded in the first place. And anyone
    who applies an NFKD/C normalization to a mathematical expression
    containing compatibility characters deserves the hash they will
    get as a result.
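
    To make the risk concrete, here is a minimal Python sketch using only
    the standard unicodedata module. It checks the bidi class of each
    character before and after NFKC folding. The helper name
    bidi_changes_under_nfkc is hypothetical, invented for this
    illustration, and the exact property values depend on the Unicode
    data shipped with the interpreter.

```python
import unicodedata

ALEF_SYMBOL = "\u2135"   # ALEF SYMBOL, the mathematical character
HEBREW_ALEF = "\u05d0"   # HEBREW LETTER ALEF


def bidi_changes_under_nfkc(text):
    """Hypothetical helper: list (char, class_before, class_after) for
    every character whose NFKC form begins with a different bidi class."""
    changes = []
    for ch in text:
        folded = unicodedata.normalize("NFKC", ch)
        before = unicodedata.bidirectional(ch)
        # Compare against the first character of the folded form only.
        after = unicodedata.bidirectional(folded[0]) if folded else ""
        if before != after:
            changes.append((ch, before, after))
    return changes


# The symbol is strong left-to-right, the letter strong right-to-left.
print(unicodedata.bidirectional(ALEF_SYMBOL))   # L
print(unicodedata.bidirectional(HEBREW_ALEF))   # R

# NFKC silently swaps one for the other.
print(unicodedata.normalize("NFKC", ALEF_SYMBOL) == HEBREW_ALEF)  # True
print(bidi_changes_under_nfkc("x = " + ALEF_SYMBOL))
```

    Running such a check before normalizing would at least flag the
    characters whose layout behavior is about to change.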

    > From a glyph perspective, the design of these two characters
    > legitimately differs, since the symbol characters are often harmonised
    > to Latin cap-height, while the traditional height of Hebrew text is
    > between Latin cap- and x-height.

    Another reason for their separate encoding.

    > This seems to me a very unwelcome decomposition, but I suppose it is
    > frozen thus for all time by stability agreements.

    Keep in mind that in the deep prehistory of Unicode, *compatibility*
    decompositions were added in part as a kind of poor man's
    cross-reference tool and in part as an ideological statement
    about Cleanicode by those opposed in principle to the
    addition of "unnecessary" variants of "real" characters.

    The architectural mistake, IMO, was in defining (much later) a normalization
    form based solely on compatibility decompositions that had
    much less of a consistent rationale than the canonical
    decompositions, and then getting stuck with an uncorrectable
    normalization form that people might end up applying in
    inappropriate circumstances.
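
    The canonical/compatibility distinction is visible directly in the
    character data. A small Python sketch, again with the standard
    unicodedata module, shows that the canonical forms leave a
    mathematical expression alone while the compatibility forms rewrite
    it. The function name has_compat_decomposition is an assumption made
    up for this example.

```python
import unicodedata


def has_compat_decomposition(ch):
    """Hypothetical helper: True if the character's decomposition mapping
    carries a compatibility tag such as <compat> or <sub>."""
    return unicodedata.decomposition(ch).startswith("<")


aleph_null = "\u2135\u2080"   # ALEF SYMBOL followed by SUBSCRIPT ZERO

# The raw mapping for ALEF SYMBOL is tagged as a compatibility one.
print(unicodedata.decomposition("\u2135"))   # <compat> 05D0

# Canonical normalization leaves the expression untouched...
print(unicodedata.normalize("NFC", aleph_null) == aleph_null)   # True

# ...while compatibility normalization yields HEBREW LETTER ALEF
# followed by an ordinary digit zero.
print(unicodedata.normalize("NFKC", aleph_null) == "\u05d00")   # True

# Characters in the expression that NFKC/NFKD would rewrite.
print([ch for ch in aleph_null if has_compat_decomposition(ch)])
```

    Text that must keep such symbols intact can be restricted to NFC or
    NFD, which only apply the canonical mappings.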

