Re: [hebrew] Re: Collation contractions and reordering, was: Hebrew composition model, with cantillation marks

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Nov 12 2003 - 08:15:30 EST


    From: "Kent Karlsson" <kentk@cs.chalmers.se>
    > The problem is that to express the alt 2 as an ICU tailoring I need
    > an "anchor" at level 3, which is ignored at levels 1 and 2. I'm not
    > sure if I can use a punctuation character (ignored at levels 1-3) as
    > such an anchor, esp. not since punctuation characters are "variable"
    > in UCA/ICU and can be made significant at level 1. And then I have
    > no anchor character... Maybe there is some special syntax I have
    > missed or not understood.

    Another problem is that an application may want to support, at the same
    time, the French backwards ordering of accents at level 2 and a correct
    ordering of Hebrew consonants with modifiers, which should not sort
    backwards.

    In that case, Hebrew consonant modifiers would need to sort at collation
    levels 3 and 4 (without backwards reordering), Hebrew vowels at level 5,
    and Hebrew cantillation and Latin case at level 6, so tailoring would be
    needed for both French and Hebrew!

    There does not seem to be a way to apply forward or backwards ordering
    at level N locally and conditionally to subsets of characters (identified
    by their weights on levels 1 to N-1), because this setting affects all
    code points.
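
    To illustrate why a single per-level setting is limiting, here is a
    minimal Python sketch. It is not ICU or UCA code: the weight table and
    the sort_key helper are made up for this example, and real DUCET weights
    look different. The direction flag applies to the whole level-2 list, so
    any Hebrew mark given a level-2 weight would be reversed along with the
    French accents.

```python
# Toy weights (hypothetical, NOT DUCET values): (primary, secondary) per character.
TOY_WEIGHTS = {
    "c": (1, 0), "o": (2, 0), "t": (3, 0), "e": (4, 0),
    "ô": (2, 1),  # same primary as "o", circumflex recorded at level 2
    "é": (4, 2),  # same primary as "e", acute recorded at level 2
}

def sort_key(word, backwards_level2):
    """Build a two-level key; the flag reverses level 2 for every character."""
    primaries = [TOY_WEIGHTS[ch][0] for ch in word]
    secondaries = [TOY_WEIGHTS[ch][1] for ch in word]
    if backwards_level2:
        # One global switch: there is no way to exempt a subset of characters.
        secondaries.reverse()
    return (primaries, secondaries)

words = ["côté", "cote", "coté", "côte"]
forward = sorted(words, key=lambda w: sort_key(w, False))
backward = sorted(words, key=lambda w: sort_key(w, True))
```

    With the flag off, the first accent difference decides; with it on, the
    last one does, which is the French dictionary convention.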

    This would need some additional syntax to create subsets with parentheses
    within the collation rule, so that each such sublevel could have its own
    reordering setting. A collation weight table could then be computed that
    would create collation levels dynamically.

    But this can be solved if levels 1, 3, 5, ..., 2N+1 are ordered forwards
    and levels 2, 4, 6, ..., 2N are ordered backwards. In that case ";"
    introduces the new collation level 3, "," introduces the new collation
    level 5, and the French backwards-ordering tailoring moves all accents
    from level 3 to level 2.
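
    A rough Python sketch of this alternating scheme follows. Everything in
    it is an assumption made for illustration: the six-level limit, the
    alternating_key helper, and the weights. Each collation element is a
    dict from level number to weight, and a missing level counts as zero.

```python
NUM_LEVELS = 6  # assumed depth, chosen only for this example

def alternating_key(elements):
    """elements: list of {level: weight} dicts, one per collation element."""
    key = []
    for level in range(1, NUM_LEVELS + 1):
        weights = [el.get(level, 0) for el in elements]
        if level % 2 == 0:
            # Even levels compare from the end of the string (backwards).
            weights.reverse()
        key.append(weights)
    return key

base = {1: 10}                      # a base letter with no marks
french_accented = {1: 10, 2: 1}     # accent tailored into backward level 2
hebrew_modified = {1: 10, 3: 1}     # consonant modifier in forward level 3

accent_first = [french_accented, base]
accent_last = [base, french_accented]
modifier_first = [hebrew_modified, base]
modifier_last = [base, hebrew_modified]
```

    Here alternating_key(accent_first) sorts before
    alternating_key(accent_last) because level 2 is read backwards, while
    alternating_key(modifier_last) sorts before
    alternating_key(modifier_first) because level 3 is read forwards, so
    both conventions coexist in one key.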

    I doubt, however, that the DUCET can support this directly, unless it is
    reinterpreted by implying intermediate levels where the weights are equal
    to [... .0000. ...].
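
    One possible reading of that reinterpretation, sketched in Python with
    hypothetical helper names and made-up weights (not actual DUCET
    entries): each three-level element is widened to six levels, with the
    original levels 1, 2, 3 landing on new levels 1, 3, 5 and zeros
    everywhere else. A French tailoring would then move the accent weight
    from new level 3 down to new level 2.

```python
def expand_element(primary, secondary, tertiary):
    """Map a 3-level element onto levels 1, 3, 5; levels 2, 4, 6 get 0x0000."""
    return (primary, 0x0000, secondary, 0x0000, tertiary, 0x0000)

def move_accent_to_backward_level(element):
    """Hypothetical French tailoring: shift the level-3 weight to level 2."""
    l1, _l2, l3, l4, l5, l6 = element
    return (l1, l3, 0x0000, l4, l5, l6)

# Made-up weights for an accented letter, for illustration only.
expanded = expand_element(0x1234, 0x0032, 0x0002)
tailored = move_accent_to_backward_level(expanded)
```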


