Re: Irish dotless I (was: Languages with letters that always take diacriticals

From: Pavel Adamek (pavel.adamek@ima.cz)
Date: Mon Mar 22 2004 - 10:42:40 EST

    > The point of the joke is that Czech
    > sorts "ch" as a single letter after "h",
    > so using a COMBINING C BEFORE
    > would make this happen automatically,
    > provided the combining character sorted after all letters.
    >
    > Spanish also sorts "ch" as a single letter,
    > but after "c", so here we
    > want a COMBINING H AFTER.
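
    To make the quoted ordering concrete, here is a minimal Python sketch
    that treats "ch" as a single collation element and only changes where
    it sits in the alphabet. The helper make_key, the two toy alphabets and
    the word list are illustrative assumptions, not anything taken from a
    real collation library, and other letters with diacritics are ignored.

```python
import string

def make_key(alphabet):
    """Build a sort key for a toy alphabet that may contain the element "ch"."""
    rank = {element: i for i, element in enumerate(alphabet)}

    def key(word):
        word = word.lower()
        out, i = [], 0
        while i < len(word):
            # Prefer the two-letter element "ch" whenever it starts here.
            if "ch" in rank and word.startswith("ch", i):
                out.append(rank["ch"])
                i += 2
            else:
                # Unknown characters sort after the whole alphabet.
                out.append(rank.get(word[i], len(rank)))
                i += 1
        return out

    return key

letters = list(string.ascii_lowercase)

# Assumption: "ch" directly after "h" (Czech) or directly after "c" (Spanish),
# as described in the quoted text.
czech = letters[:]
czech.insert(czech.index("h") + 1, "ch")
spanish = letters[:]
spanish.insert(spanish.index("c") + 1, "ch")

words = ["cihla", "chata", "hrad", "dum", "ivan"]
czech_sorted = sorted(words, key=make_key(czech))
spanish_sorted = sorted(words, key=make_key(spanish))
print(czech_sorted)
print(spanish_sorted)
```

    With the Czech alphabet "chata" lands after "hrad"; with the Spanish
    one it lands directly after "cihla".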

    There is no need for <COMBINING H AFTER>,
    because the same effect can be composed from
    <034F COMBINING GRAPHEME JOINER><H>.

    Sorted by code point, this places CH between
    <C><034E COMBINING UPWARDS ARROW BELOW>
    and
    <C><0350 COMBINING RIGHT ARROWHEAD ABOVE>,
    since 034F lies between 034E and 0350.
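
    The placement can be checked with a few lines of Python, assuming a
    plain code point comparison (which is what Python's default string
    sort does) rather than any real collation algorithm:

```python
import unicodedata

# C + U+0350, C + U+034F + H, and C + U+034E, deliberately out of order.
candidates = ["C\u0350", "C\u034fH", "C\u034e"]

# Python compares str values code point by code point.
ordered = sorted(candidates)

for s in ordered:
    print(" ".join(f"U+{ord(ch):04X}" for ch in s))

# Confirm that U+034F really is the grapheme joiner.
cgj_name = unicodedata.name("\u034f")
print(cgj_name)
```

    The sequence with 034F comes out in the middle, which is all the
    joke relies on.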

    > Of course, this is really not the way
    > to do language-sensitive collation.

    For easy multi-level comparison,
    let us define new characters:
    <COMBINING LEVEL 1 GRAPHEME JOINER>
    <COMBINING LEVEL 2 GRAPHEME JOINER>
    ...

    Then, for example, instead of
    <C><COMBINING CARON><E><S><K><Y><COMBINING ACUTE>
    code it as
    <C><COMBINING LEVEL 1 GRAPHEME JOINER><COMBINING CARON>
    <E><S><K><Y><COMBINING LEVEL 2 GRAPHEME JOINER><COMBINING ACUTE>
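
    A rough Python sketch of how such markers could drive a two-level
    comparison follows. The LEVEL joiners do not exist in Unicode, so the
    private-use code points U+E001 and U+E002 stand in for them here; that
    choice, the function sort_key and the sample words are all assumptions
    made for illustration. The input is assumed to be well formed, with
    every marker following a base letter and preceding a combining mark.

```python
# Hypothetical stand-ins for the proposed joiners (private-use code points).
LEVEL1 = "\ue001"
LEVEL2 = "\ue002"
CARON = "\u030c"   # COMBINING CARON
ACUTE = "\u0301"   # COMBINING ACUTE ACCENT

def sort_key(text):
    """Split text into a level-1 key (letters) and a level-2 key (accents)."""
    primary, secondary = [], []
    level = None
    for ch in text:
        if ch in (LEVEL1, LEVEL2):
            level = ch                      # remember which level the next mark has
        elif level == LEVEL1:
            primary[-1] += ch               # mark makes a different letter
            level = None
        elif level == LEVEL2:
            secondary.append((len(primary) - 1, ch))  # mark is only an accent
            level = None
        else:
            primary.append(ch)
    # Tuples compare level 1 first; level 2 only breaks ties.
    return (primary, secondary)

plain = "cesky"
accented = "cesky" + LEVEL2 + ACUTE
cukr = "cukr"
cesky = "c" + LEVEL1 + CARON + "esky" + LEVEL2 + ACUTE

result = sorted([cesky, cukr, accented, plain], key=sort_key)
```

    Here the acute on Y only separates words whose letters are otherwise
    equal, while the caron on C moves the whole word behind everything
    that starts with a plain C.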

          P.A.


