Re: LC_CTYPE locale category and character sets.

From: John Cowan (cowan@locke.ccil.org)
Date: Mon Jul 27 1998 - 12:04:53 EDT


Peter_Constable@sil.org wrote:

> Could you please explain your statement that similar things were done
> in the Thai block (similar to what was done with Turkish dotted i).

I meant that compatibility with the existing encoding was given
priority over Unicode theory.

> In what respects is the Thai block non-Unicode in flavour?

The Thai script, like most of the others in the area, is Brahmic
in nature, but no Brahmic harmonization was applied to it;
the order of characters tracks TIS 620-2529 instead.
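
To make the point about ordering concrete, here is a minimal sketch, assuming Python's built-in "tis_620" codec is a faithful stand-in for the national standard. It checks whether each assigned Thai byte decodes to the code point at the same offset within the Unicode block. The function name and the two constants are my own, chosen for illustration.

```python
# Sketch: compare TIS-620 byte positions with Unicode Thai code points.
# Assumption: Python's "tis_620" codec reflects the standard's layout.
THAI_BLOCK_START = 0x0E00  # first code point of the Unicode Thai block
TIS_OFFSET = 0xA0          # TIS-620 Thai characters start just above 0xA0


def tis620_matches_unicode_order():
    """Return True if every assigned Thai byte keeps its relative position."""
    # 0xDB-0xDE and 0xFC-0xFF are unassigned in the codec, so skip them.
    assigned = list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC))
    for byte in assigned:
        decoded = bytes([byte]).decode("tis_620")
        if ord(decoded) != THAI_BLOCK_START + (byte - TIS_OFFSET):
            return False
    return True


print(tis620_matches_unicode_order())
```

If the check holds, converting between the two encodings is a constant offset, which is what tracking the existing standard buys you.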

Nearly all the elements of the script are labeled THAI CHARACTER without
discrimination (the only use of CHARACTER in the long names), rather
than the usual LETTER, CONSONANT, VOWEL, VOWEL SIGN, SYMBOL, etc.
The exceptions are the digits (THAI DIGIT ZERO through THAI DIGIT NINE)
and U+0E3F THAI CURRENCY SYMBOL BAHT, which have harmonized names.
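
The naming pattern can be inspected directly. The following is a rough sketch using Python's standard unicodedata module; the helper name thai_name_exceptions is hypothetical, and the result depends on the Unicode version bundled with the interpreter. It lists every assigned character in U+0E00..U+0E7F whose name does not begin with THAI CHARACTER.

```python
import unicodedata


def thai_name_exceptions():
    """List (code point, name) pairs in the Thai block not named THAI CHARACTER."""
    exceptions = []
    for cp in range(0x0E00, 0x0E80):
        # Unassigned code points have no name; skip them.
        name = unicodedata.name(chr(cp), None)
        if name is not None and not name.startswith("THAI CHARACTER "):
            exceptions.append((f"U+{cp:04X}", name))
    return exceptions


for code_point, name in thai_name_exceptions():
    print(code_point, name)
```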

Thai left-side (preposed) vowels are encoded before their consonants
(i.e. in visual order), unlike the left-side vowel signs of the other
Brahmic scripts, Lao excepted, which are encoded after the consonant.
(Alternative characters for phonetic order were encoded in Unicode
1.0 but quickly removed.)
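
As an illustration of the difference, here is a small sketch that dumps the stored order of two syllables, each with a vowel displayed to the left of its consonant. The sample strings and the describe helper are my own choices, not taken from any standard.

```python
import unicodedata


def describe(text):
    """Return (code point, name, general category) for each character, in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# Thai: SARA E is stored before KO KAI, matching what the eye sees.
thai = "\u0E40\u0E01"
# Devanagari: VOWEL SIGN I is displayed to the left of KA but stored after it.
devanagari = "\u0915\u093F"

for sample in (thai, devanagari):
    for entry in describe(sample):
        print(*entry)
    print()
```

In the Thai case the vowel is an ordinary spacing letter that comes first in memory; in the Devanagari case it is a combining mark that follows its consonant, and the renderer moves it to the left.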

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


