Re: Encoding Pronunciation (was: Comment on PRI 98: IVD Adobe-Japan1 (pt.2))

Date: Sat Mar 24 2007 - 08:52:07 CST


    The Welsh examples below remind me of the 'normal' sorting order for
    Zhuang, which would be, for example:

    na (thick), naz (field), naj (face), nax (aunt), naq (arrow). The
    letters z, j, x, and q indicate tones 2, 3, 4, and 5 respectively and,
    what's more, are never used to represent sounds. The letter h, which
    represents tone 6, is also used at the beginning of a syllable to
    represent a sound. There are one or two other rules. In the Cyrillic
    form of Zhuang an automatic sort might even come out in the right
    order for the simple example above.
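
    The simple case above can be sketched as a sort key. This is only a
    rough illustration for single syllables: the function name and the
    table are made up for the example, placing tone 6 (h) after tone 5 is
    my assumption, and the "one or two other rules" are not handled.

```python
# Final tone letters as described above; no tone letter means tone 1.
# Ranking h (tone 6) after q (tone 5) is an assumption of this sketch.
TONE_RANK = {"z": 2, "j": 3, "x": 4, "q": 5, "h": 6}


def zhuang_syllable_key(syllable):
    """Hypothetical sort key for ONE syllable: (letters, tone number)."""
    s = syllable.lower()
    # Only a FINAL tone letter counts; an initial h is an ordinary sound.
    if len(s) > 1 and s[-1] in TONE_RANK:
        return (s[:-1], TONE_RANK[s[-1]])
    return (s, 1)


words = ["naq", "nax", "na", "naj", "naz"]
sorted_words = sorted(words, key=zhuang_syllable_key)
print(sorted_words)  # ['na', 'naz', 'naj', 'nax', 'naq']
```

    A plain code-point sort would instead give na, naj, naq, nax, naz,
    which is why the tone letter has to be ranked separately.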

    John Knightley

    > But don't we already have something like that for Welsh and Slovak?
    > The lower case Welsh letter 'ng', which represents a velar nasal, is
    > encoded as <U+006E LATIN SMALL LETTER N, U+0067 LATIN SMALL LETTER G>
    > (e.g. Angharad), while the 'coincidental' occurrence of a nasal and a
    > voiced velar stop should be encoded as <U+006E, U+034F COMBINING
    > GRAPHEME JOINER, U+0067> (e.g. Bangor and Llangollen) if you want it to
    > collate properly without dictionary look-ups. (Without CGJ,
    > 'Llangollen' would collate before 'Llanberis', as 'ng' comes between
    > 'g' and 'h'.) I believe that the distinction between <U+17D2 KHMER
    > SIGN COENG, U+178A KHMER LETTER DA> and <U+17D2 KHMER SIGN COENG,
    > U+178F KHMER LETTER TA> is likewise phonetic (rather than
    > etymological), but I can no longer find the definition of the
    > difference between these two graphically identical sequences. The
    > crucial point in at least the Welsh and Slovak cases is that the
    > difference affects collation order.
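
    The Welsh point in the quoted text can be tried out with a small
    sketch. It is not the Unicode Collation Algorithm or any real Welsh
    tailoring: the names are hypothetical, the alphabet list is the usual
    Welsh one typed in by hand, and the key simply treats a digraph as one
    letter unless U+034F COMBINING GRAPHEME JOINER splits it.

```python
CGJ = "\u034f"  # U+034F COMBINING GRAPHEME JOINER

# Welsh alphabet with digraphs as single letters ('ng' between 'g' and 'h').
WELSH_ALPHABET = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng", "h", "i",
    "j", "l", "ll", "m", "n", "o", "p", "ph", "r", "rh", "s", "t", "th",
    "u", "w", "y",
]
RANK = {letter: i for i, letter in enumerate(WELSH_ALPHABET)}


def welsh_key(word):
    """Hypothetical sort key: a list of alphabet positions, one per letter."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        if text[i] == CGJ:
            i += 1  # CGJ has no weight; it only blocks digraph matching
            continue
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            key.append(RANK[pair])  # digraph counts as a single letter
            i += 2
        else:
            # Letters outside the alphabet sort after everything else.
            key.append(RANK.get(text[i], len(RANK)))
            i += 1
    return key


plain = "Llangollen"          # 'ng' read as the single letter
joined = "Llan\u034fgollen"   # n + CGJ + g: two separate letters

print(sorted([plain, "Llanberis"], key=welsh_key)[0] == plain)
print(sorted([joined, "Llanberis"], key=welsh_key)[0] == "Llanberis")
```

    Both lines print True: without CGJ 'Llangollen' lands before
    'Llanberis', and with CGJ it lands after it, as described above.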

