From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Aug 20 2005 - 05:38:10 CDT
From: "Richard Wordingham" <richard.wordingham@ntlworld.com>
> I can understand the gripes about 'level-2' v. 'level-1' implementation, 
> though.  I find it distinctly irritating that the newly added Tamil 
> consonant SHA U+0BB6 won't combine with vowels in Windows XP, and seems 
> unlikely to unless one buys otherwise unneeded word processing packages.
I understand that too: if one can demonstrate that Tamil is correctly
handled by a level-1-only implementation (yes, this can be tested using
PUAs), then that will establish the correct processing rules for handling
Tamil as it is currently encoded in Unicode.
So it's up to Unicode to verify that processing based on the current
standard encoding is consistent with the level-1 implementation based on
PUAs. This could be tested by using a mapping table between the two
representations, and comparing the results of the level-2 implementation
on standard Unicode with those of the level-1 implementation on the
"New Tamil" PUA block.
But one must also verify that this is consistent with the Indian ISCII
standard for Tamil (there may be a few quirks in exceptional cases that
are normally absent from human language, so they won't matter there).
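
To make the mapping-table test concrete, here is a minimal sketch in
Python. Everything specific in it is an assumption made for illustration:
the "New Tamil" PUA values (U+E0xx) are invented and are not taken from any
proposal, and the names to_pua, from_pua and round_trips are hypothetical.
It only shows how standard Unicode Tamil sequences could be converted to
single PUA code points and back, so that a lossless round trip can be
checked before the two implementations are compared:

```python
# Hypothetical mapping table: each standard Unicode Tamil sequence
# (a consonant, optionally followed by a vowel sign) is given one code
# point in the Private Use Area. The PUA values are illustrative only.
UNICODE_TO_PUA = {
    "\u0B95": "\uE000",        # TAMIL LETTER KA
    "\u0B95\u0BBE": "\uE001",  # KA + VOWEL SIGN AA
    "\u0B95\u0BBF": "\uE002",  # KA + VOWEL SIGN I
    "\u0BB6": "\uE010",        # TAMIL LETTER SHA
    "\u0BB6\u0BBE": "\uE011",  # SHA + VOWEL SIGN AA
    "\u0BB6\u0BBF": "\uE012",  # SHA + VOWEL SIGN I
}
PUA_TO_UNICODE = {pua: seq for seq, pua in UNICODE_TO_PUA.items()}
MAX_KEY_LEN = max(len(seq) for seq in UNICODE_TO_PUA)


def to_pua(text):
    """Convert standard Unicode text to the hypothetical PUA form,
    always preferring the longest sequence found in the table."""
    out = []
    i = 0
    while i < len(text):
        for n in range(MAX_KEY_LEN, 0, -1):
            chunk = text[i:i + n]
            if chunk in UNICODE_TO_PUA:
                out.append(UNICODE_TO_PUA[chunk])
                i += len(chunk)
                break
        else:
            # Not in the table: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def from_pua(text):
    """Convert the hypothetical PUA form back to standard Unicode."""
    return "".join(PUA_TO_UNICODE.get(ch, ch) for ch in text)


def round_trips(text):
    """True if converting to PUA and back reproduces the input."""
    return from_pua(to_pua(text)) == text
```

A real test would feed the same corpus through both paths and compare the
rendered or processed results; the round-trip check above only confirms
that the mapping table itself loses no information.
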
Another option would be to develop a "New Tamil" charset for testing, and
to establish a mapping table between it and ISCII (this would not require
allocating PUAs). Once this works, one can then define the correct mapping
table between "New Tamil" and standard Unicode (without using PUAs!).
Although I don't like the idea of publishing new 8-bit charset standards, it
certainly helps when it reduces the number of cases that must be tested and
supported in order to handle a script or language correctly.