Re: Logical Storage Order For Complex Vowels in Tai Tham

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Mon Jan 31 2011 - 03:18:09 CST

    On Sun, 30 Jan 2011 23:44:12 -0500
    Ed <ed.trager@gmail.com> wrote:

    > On Sun, Jan 30, 2011 at 6:48 PM, Richard Wordingham
    > <richard.wordingham@ntlworld.com> wrote:
    > > On Fri, 28 Jan 2011 10:23:17 -0600
    > > Ed <ed.trager@gmail.com> wrote:
    > >
    > >> Hi, Everyone,
    > >>
    > >> In ISO/IEC JTC1/SC2/WG2 document N3121, "Proposal for encoding the
    > >> Lanna script in the BMP of the UCS", the table of examples on pages
    > >> 2-3 of section 5 "Dependent vowel signs" appears to imply (but note
    > >> that the text does not *explicitly* state) that the decompositions
    > >> shown are in fact the logical storage order.
    > >>
    > >> For most of the examples shown, the logical order makes sense.  But
    > >> for combinations containing U+1A6C OA BELOW, it appears that an
    > >> arbitrary choice has been made regarding the logical storage
    > >> position of U+1A6C.
    > >>
    > >> In the examples in N3121, U+1A6C OA BELOW appears after U+1A6E
    > >> VOWEL E (which makes sense to me) but (for example) before U+1A65
    > >> VOWEL I, and the latter does not make sense to me.
    > >
    > > The combining *vowels* in a syllable have been written in accordance
    > > with the rule for Thai-script character stacks, namely (pre-vocalic)
    > > consonants, and then vowels and tone-marks from bottom to top.  For
    > > example, <U+0E4D THAI CHARACTER NIKHAHIT> follows <U+0E38 THAI
    > > CHARACTER SARA U> when writing Pali.
    > >
    >
    > OK ...
    >
    > > If Unicode hadn't balked at the idea of decomposing characters of
    > > non-zero combining class (e.g. U+0D4B MALAYALAM VOWEL SIGN OO, which
    > > consequently wound up as class 0), then we might have assigned U+1A6C
    > > OA BELOW class 220 and U+1A65 VOWEL I class 230.  In accordance
    > > with this principle, I assume that multiple vowels below would be
    > > ordered from top to bottom as with European scripts, but I haven't
    > > found any examples that would make this an issue.
    > >
    > > A surprise from the Thai point of view is the treatment of the -ua
    > > and -ia vowels - these are <U+1A60 SAKOT, U+1A45 WA, U+1A6B VOWEL O>
    >
    > ... I can live with that one
    >
    > > and
    > > <U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E VOWEL E>.
    >
    > ... but this one makes no sense at all to me.

    They're parallel! Also note that there is a tendency for /ua/ and
    /ia/ to simplify as /o/ and /e/, a process that seems to be complete
    in Shan, Tai Khuen and Tai Lue.
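
The parallel between the two storage orders can be made visible with a small sketch. It assumes a Python build whose `unicodedata` tables include Tai Tham (Unicode 5.2 or later); the helper names `describe` and `shape` are invented for this illustration and are not part of any proposal or library.

```python
import unicodedata

# Storage orders as quoted in the message above.
UA = "\u1A60\u1A45\u1A6B"  # SAKOT, WA, VOWEL O
IA = "\u1A60\u1A3F\u1A6E"  # SAKOT, LOW YA, VOWEL E


def describe(seq):
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq]


def shape(seq):
    """Reduce a sequence to a coarse pattern based on character names.

    Hypothetical classification, for illustration only.
    """
    pattern = []
    for ch in seq:
        name = unicodedata.name(ch)
        if "SAKOT" in name:
            pattern.append("SAKOT")
        elif "VOWEL SIGN" in name:
            pattern.append("VOWEL")
        elif "LETTER" in name:
            pattern.append("LETTER")
        else:
            pattern.append("OTHER")
    return pattern


if __name__ == "__main__":
    for label, seq in (("-ua", UA), ("-ia", IA)):
        print(label, describe(seq), shape(seq))
```

Both sequences reduce to the same SAKOT, LETTER, VOWEL pattern, which is the sense in which the two spellings are parallel.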

    However, I don't think a *font* should reject the sequence <U+1A6E
    VOWEL E, U+1A60 SAKOT, U+1A3F LOW YA> - it would make sense as /ei/,
    even though that might not occur in any Tai or Indic language written in
    the Lanna script.

    I can't immediately rule out the occurrence of <U+1A60 SAKOT, U+1A3F
    LOW YA, U+1A6E VOWEL E> in Pali, and I'm not sure one shouldn't write
    Sanskrit in the Lanna script. After all, it supports Sanskritisation.

    In short, a font has to support the sequence whatever view you
    take on the orthography of Tai languages.
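
The combining-class point raised earlier in the thread has a practical consequence that can be checked with Python's standard `unicodedata` module. This is only a sketch, again assuming Unicode tables that include Tai Tham; `E_FIRST`, `E_LAST` and `survives_normalization` are names made up for the example. Because the vowel signs have canonical combining class 0, normalization does not reorder them, so both orderings of VOWEL E relative to SAKOT plus LOW YA can reach a font unchanged.

```python
import unicodedata

# The two orderings discussed above.
E_FIRST = "\u1A6E\u1A60\u1A3F"  # VOWEL E, SAKOT, LOW YA
E_LAST = "\u1A60\u1A3F\u1A6E"   # SAKOT, LOW YA, VOWEL E


def ccc(ch):
    """Canonical combining class of a single character."""
    return unicodedata.combining(ch)


def survives_normalization(text):
    """True if neither NFC nor NFD alters the sequence."""
    return all(unicodedata.normalize(form, text) == text
               for form in ("NFC", "NFD"))


if __name__ == "__main__":
    # Class 0 for the Tai Tham vowels and for the Malayalam example.
    for cp in ("\u1A6C", "\u1A65", "\u1A6E", "\u0D4B"):
        print(f"U+{ord(cp):04X}", unicodedata.name(cp), "ccc =", ccc(cp))

    # U+0D4B has a canonical decomposition despite being class 0.
    print("U+0D4B decomposes to:", unicodedata.decomposition("\u0D4B"))

    # Neither ordering is touched by normalization, so they stay distinct.
    print(survives_normalization(E_FIRST), survives_normalization(E_LAST))
```

Had U+1A6C and U+1A65 been given classes 220 and 230, canonical ordering would have fixed their relative order automatically; with class 0 the order is whatever the writer stores.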

    Richard.


