From: Richard Wordingham (firstname.lastname@example.org)
Date: Sun Jan 30 2011 - 17:48:16 CST
On Fri, 28 Jan 2011 10:23:17 -0600
Ed <email@example.com> wrote:
> Hi, Everyone,
> In ISO/IEC JTC1/SC2/WG2 document N3121, "Proposal for encoding the
> Lanna script in the BMP of the UCS", the table of examples on pages
> 2-3 of section 5 "Dependent vowel signs" appears to imply (but note
> that the text does not *explicitly* state) that the decompositions
> shown are in fact the logical storage order.
> For most of the examples shown, the logical order makes sense. But
> for combinations containing U+1A6C OA BELOW, it appears that an
> arbitrary choice has been made regarding the logical storage position
> of U+1A6C.
> In the examples in N3121, U+1A6C OA BELOW appears after U+1A6E VOWEL E
> (which makes sense to me) but (for example) before U+1A65 VOWEL I
> --and the latter does not make sense to me.
The combining *vowels* in a syllable have been written in accordance
with the rule for Thai-script character stacks, namely (pre-vocalic)
consonants, and then vowels and tone-marks from bottom to top. For
example, <U+0E4D THAI CHARACTER NIKHAHIT> follows <U+0E38 THAI CHARACTER
SARA U> when writing Pali.
If Unicode hadn't balked at the idea of decomposing characters of
non-zero combining class (e.g. U+0D4B MALAYALAM VOWEL SIGN OO, which
consequently wound up as class 0), then we might have assigned U+1A6C OA
BELOW class 220 and U+1A65 VOWEL I class 230. In accordance with this
principle, I assume that multiple vowels below would be ordered from
top to bottom as with European scripts, but I haven't found any
examples that would make this an issue.
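
As a rough illustration of why the combining classes matter here, the
following Python sketch looks up the canonical combining class of the
marks mentioned above and checks whether the two candidate vowel orders
are canonically equivalent. It relies only on the standard unicodedata
module; the choice of U+1A20 HIGH KA as base consonant and the helper
name combining_classes are my own assumptions for the example, and the
values reported depend on the Unicode version bundled with your Python.

```python
import unicodedata

# Marks named in the discussion above.
MARKS = {
    "TAI THAM VOWEL SIGN OA BELOW": "\u1A6C",
    "TAI THAM VOWEL SIGN I": "\u1A65",
    "TAI THAM SIGN SAKOT": "\u1A60",
    "TAI THAM SIGN TONE-1": "\u1A75",
    "THAI CHARACTER SARA U": "\u0E38",
    "THAI CHARACTER NIKHAHIT": "\u0E4D",
}


def combining_classes(marks):
    """Map each mark name to its canonical combining class."""
    return {name: unicodedata.combining(ch) for name, ch in marks.items()}


classes = combining_classes(MARKS)
for name, value in classes.items():
    print(f"{value:3d}  {name}")

# Hypothetical test syllable: an arbitrary base consonant with both vowels.
HIGH_KA = "\u1A20"
below_first = HIGH_KA + "\u1A6C" + "\u1A65"  # OA BELOW, then VOWEL I
above_first = HIGH_KA + "\u1A65" + "\u1A6C"  # VOWEL I, then OA BELOW

# Marks of class 0 are never reordered by normalisation, so two strings
# that differ only in the order of such marks stay distinct.
orders_equivalent = (
    unicodedata.normalize("NFD", below_first)
    == unicodedata.normalize("NFD", above_first)
)
print("canonically equivalent:", orders_equivalent)
```

If both vowels report class 0, as I would expect, normalisation cannot
reconcile the two orders, which is why the storage order has to be
settled by convention rather than by the normalisation algorithm.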
A surprise from the Thai point of view is the treatment of the -ua and
-ia vowels - these are <U+1A60 SAKOT, U+1A45 WA, U+1A6B VOWEL O> and
<U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E VOWEL E>. The order put forward
was based on native intuition as reported by Martin Hosken.
> As shown in the attached image, I would have expected that
> U+1A65 VOWEL I appear *BEFORE* U+1A6C OA BELOW . My expectation
> follows from the order in which I write the marks: That is, I write
> Tai Tham on paper from left to right, and from top to bottom.
Only Indo-Chinese Indic scripts have permission to follow
the handwriting order; publishing in Tai Tham isn't strictly legal
in Thailand, and Lao use was dismissed with contempt.
More seriously, alternations between vertical and horizontal stacking
of marks above indicate that if left-to-right ordering means anything,
their order is from bottom to top rather than top to bottom. In
particular, the normal order goes pure vowel, mai kang, tone mark.
(There may be constraints on vertical stacking. Printed Tai Khuen is
restricted to three rows - above, base consonant line, and below.
Multiple characters above or below may invade the territory of the
following consonant, with some bizarre consequences. Much Northern
Thai also only allows one row below - the 'Northern Dictionary of
Palm-Leaf Manuscripts' is a good example, and the more deeply
descending sequences are often deliberately avoided. For example, some
books' drills in final consonants imply that vowels below are not
followed by subscript final consonants.)
> So I
> write vowel marks appearing *ABOVE* base consonants before I write
> vowel marks *BELOW*.
> U+1A6C OA BELOW is the most common vowel sign that can result in this
> kind of confusion. However it may not be the only one. There are a
> number of diphthong and triphthong vowels which occur in the various Tai
> languages and these are of course written using various combinations
> of 2 or more Tai Tham vowel signs.
> 1A6A;TAI THAM VOWEL SIGN UU
> It appears that N3121 was not the "final" version document used when
> Tai Tham was approved for encoding; but I am not clear what the
> subsequent document(s) were?
> In any case, the examples provided in N3121 seem to me insufficient
> and, as already noted, nowhere does it explicitly state in N3121 that
> the decompositions represent the backing store order.
I trust you intend to be able to render words such as ᨻᩦ᩠᩵ᨶᩬ᩶ᨦ <U+1A3B
LOW PA, U+1A66 SIGN II, U+1A75 TONE-1, U+1A60 SAKOT, U+1A36 NA, U+1A6C
OA BELOW, U+1A76 TONE-2, U+1A26 LETTER NGA> piinɔɔŋ, which in a font
without overhang support or proper vertical kerning/ligaturing one might
attempt to write as <U+1A3B, U+1A6C, U+1A60, U+1A36, U+1A66, U+1A75,
> Perhaps there is a need for a separate document to clarify what the
> backing store order should be for diphthong and triphthong vowels, inter
> alia, for Tai languages/dialects using Tai Tham script?
Don't forget the issue of consecutive syllables sharing the initial
consonants. Not every style writes <U+1A7B TAI THAM SIGN MAI SAM> to
indicate the duplication, so in principle you could get vowels in
either order. Now, I presume the careful one-akshara spelling of
ᨡᩮᩢ᩶᩻ᩬᩣ᩠ᨦ khaokhɔɔŋ 'possessions' would be <U+1A21 HIGH KHA, U+1A6E
SIGN E, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A7B MAI SAM, U+1A6C OA BELOW,
U+1A63 SIGN AA, U+1A60 SAKOT, U+1A26 NGA>. Now, if one omits the mai
sam, as even the Maefahluang dictionary does, does the spelling simply
change by omitting U+1A7B, or does it rearrange to <U+1A21 HIGH KHA,
U+1A6E SIGN E, U+1A6C OA BELOW, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A63
SIGN AA, U+1A60 SAKOT, U+1A26 NGA>? A similar case with the vowel
below coming first is given by the contraction ᨧᩩ᩵ᩢᨦᨾᩦ <U+1A27 HIGH
CA, U+1A69 SIGN U, U+1A75 TONE-1, U+1A62 MAI SAT, U+1A26 NGA, U+1A3E
MA, U+1A66 SIGN II> of ᨧᩩ᩵ᨦᨧᩢ᩠ᨠᨾᩦ <U+1A27, U+1A69, U+1A75, U+1A26, U+1A27,
U+1A62, U+1A60, U+1A20 HIGH KA, U+1A3E, U+1A66> cuŋ cak mii.
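
To make the mai sam question concrete, here is a small Python sketch
that builds the two candidate mai-sam-less spellings of khaokhɔɔŋ from
the code point sequences given above and tests whether normalisation
treats them as the same string. The helper names seq and
canonically_equivalent are made up for this example; only the standard
unicodedata module is assumed.

```python
import unicodedata


def seq(*code_points):
    """Build a string from a list of code point numbers."""
    return "".join(chr(cp) for cp in code_points)


def canonically_equivalent(a, b):
    """True if the two strings have the same canonical decomposition."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Careful one-akshara spelling, with U+1A7B MAI SAM.
with_mai_sam = seq(0x1A21, 0x1A6E, 0x1A62, 0x1A76, 0x1A7B,
                   0x1A6C, 0x1A63, 0x1A60, 0x1A26)

# Candidate 1: simply drop the mai sam.
omitted = with_mai_sam.replace("\u1A7B", "")

# Candidate 2: the rearranged order, with OA BELOW moved forward.
rearranged = seq(0x1A21, 0x1A6E, 0x1A6C, 0x1A62, 0x1A76,
                 0x1A63, 0x1A60, 0x1A26)

# Both candidates use the same characters, only in a different order.
same_characters = sorted(omitted) == sorted(rearranged)
equivalent = canonically_equivalent(omitted, rearranged)

print("same characters:       ", same_characters)
print("canonically equivalent:", equivalent)
```

If the two candidates are not canonically equivalent, then searching
and collation will see two different words unless one spelling is
agreed upon.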