Re: Hebrew script in IDN

From: Cary Karp
Date: Mon Nov 21 2005 - 01:17:44 CST

    Quoting Mark E. Shoulson:

    > I'd venture to say that double-vav, vav-yod, and yod-yod ligatures
    > should have *canonical* decomposition to their constituent letters! I'm
    > sure that would cause problems of some sort, but at least compatibility
    > decomposition is necessary.
    > Doesn't really matter which is the more frequently entered; we normalize
    > strings all the time in Unicode.

    Why are they not being normalized here?

    I assume that at least part of the answer lies in the fourth Yiddish
    digraph, 'pasekh tsvey yudn', which is written as HEBREW LIGATURE
    YIDDISH DOUBLE YOD followed by HEBREW POINT PATAH (U+05F2 U+05B7), and
    which (I further assume) would decompose and recompose correctly only
    if the YIDDISH DOUBLE YOD ligature were the canonical form. What I
    don't understand is why the entire pointed digraph wasn't represented
    as a single precombined character, with it then being possible to
    decompose the other three ligatures as Mark suggests.
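
    For anyone wanting to check the current behaviour directly, here is a
    minimal sketch using Python's standard unicodedata module. It prints
    the decomposition field for each of the three ligatures and shows the
    sequence U+05F2 U+05B7 under the four normalization forms. The
    dictionary labels and the helper name normalization_report are my own,
    chosen for illustration, and the results reflect whatever version of
    the Unicode data ships with the interpreter in use.

```python
import unicodedata

# Code points named in the discussion (the labels are informal).
LIGATURES = {
    "double vav": "\u05F0",
    "vav yod": "\u05F1",
    "double yod": "\u05F2",
}
PATAH = "\u05B7"  # HEBREW POINT PATAH


def normalization_report(text):
    """Return the code points of `text` under each normalization form."""
    return {
        form: [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


for label, ch in LIGATURES.items():
    # An empty decomposition field means the character has neither a
    # canonical nor a compatibility mapping in the data being consulted.
    print(label, unicodedata.name(ch), repr(unicodedata.decomposition(ch)))
    print("  ", normalization_report(ch))

# The pointed digraph as spelled in this message: U+05F2 followed by U+05B7.
print(normalization_report(LIGATURES["double yod"] + PATAH))
```

    An empty decomposition field for all three ligatures would explain
    why no normalization form replaces them with their constituent
    letters.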

    With apologies for not having been able to locate the answers to the
    following questions and thus needing to pose them on this list:

    Is there a categorical ban on the assignment of code points to new
    characters that can be represented by combining preexisting characters
    and, if so, where will I find a citable reference to it?


