From: Sinnathurai Srivas ([email protected])
Date: Sat Jun 25 2005 - 16:07:53 CDT
Version 2 (Please provide comments.)
Tamil Collation vs Transliteration/Transcription Encoding
Although its implementations suffer from numerous problems, Unicode is based
on a highly sophisticated technical architecture.
Unfortunately, on the issue of collation, because of the design of ISCII,
Unicode had to abandon a sorting-based encoding of Tamil in favour of a
transliteration-based encoding.
While Devanagari has the upper hand in keeping its sorting-based encoding,
all other Indic languages were given a transliteration-based encoding. The
saddest thing is that the transliteration-based encoding cannot operate as
planned, as there is never a one-to-one mapping between languages.
For example, Tamil K can indicate k, h, g, q, x and other related phonemes,
while Devanagari has individual character shapes representing individual
phonemes. Tamil is based on an alphabet-based phonemic system, while
Devanagari is based on a phonemic system.
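The missing one-to-one mapping is visible in the code charts themselves. The
following is only a rough sketch, assuming the Tamil block (U+0B80) mirrors
the Devanagari block (U+0900) offset for offset, as the ISCII-derived layout
intends. The helper name tamil_counterpart is hypothetical and exists only
for this illustration.

```python
import unicodedata

DEVANAGARI_BASE = 0x0900  # start of the Devanagari block
TAMIL_BASE = 0x0B80       # start of the Tamil block


def tamil_counterpart(devanagari_char):
    """Return the Tamil character at the same block offset, or None.

    Hypothetical helper: it only checks whether the parallel code point
    is assigned; it says nothing about pronunciation.
    """
    offset = ord(devanagari_char) - DEVANAGARI_BASE
    candidate = chr(TAMIL_BASE + offset)
    # unicodedata.name returns the default (None) for unassigned code points.
    if unicodedata.name(candidate, None) is None:
        return None
    return candidate


# Devanagari KA, KHA, GA, GHA (U+0915..U+0918)
for ch in "\u0915\u0916\u0917\u0918":
    print(unicodedata.name(ch), "->", tamil_counterpart(ch))
```

Only Devanagari KA has a Tamil letter at the parallel position. The slots
for KHA, GA and GHA are unassigned in the Tamil block, so text cannot be
moved between the two scripts by offset alone.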
It would be a superior solution if Unicode changed its policy from the
unimportant and non-functioning transliteration-based encoding to a natural
sorting-based encoding.
However, expecting Unicode to change its encoding philosophy from ISCII-based
transliteration encoding to natural sorting-based encoding is not going to
be easy.
We will need to work with what is imposed on Tamil and find software
solutions to meet the sorting requirements.
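As a minimal sketch of what such a software fix might look like (an
assumption for illustration, not an agreed solution), a sort key can remap
the 18 Tamil consonants so that they compare in the traditional order rather
than in code point order. The names TRADITIONAL_CONSONANTS and
tamil_sort_key are hypothetical. Grantha letters, aytham and the finer rules
for vowel signs and pulli are deliberately left out.

```python
# The 18 Tamil consonants in traditional alphabetical order.
TRADITIONAL_CONSONANTS = (
    "\u0b95\u0b99\u0b9a\u0b9e\u0b9f\u0ba3"  # ka nga ca nya tta nna
    "\u0ba4\u0ba8\u0baa\u0bae\u0baf\u0bb0"  # ta na pa ma ya ra
    "\u0bb2\u0bb5\u0bb4\u0bb3\u0bb1\u0ba9"  # la va llla lla rra nnna
)

# Reuse the consonants' own code points as sort weights, but hand them out
# in traditional order: the first traditional consonant gets the smallest
# code point, the last one gets the largest.
_WEIGHTS = "".join(sorted(TRADITIONAL_CONSONANTS))
_TABLE = str.maketrans(TRADITIONAL_CONSONANTS, _WEIGHTS)


def tamil_sort_key(word):
    """Return a key that orders the 18 consonants traditionally.

    All other characters keep their code point weight (an assumption
    made to keep the sketch short).
    """
    return word.translate(_TABLE)


words = ["\u0ba8\u0ba9\u0bcd\u0bb1\u0bbf",  # nanri
         "\u0ba8\u0bb0\u0bbf"]              # nari

print(sorted(words))                      # code point order
print(sorted(words, key=tamil_sort_key))  # traditional consonant order
```

In code point order NNNA (U+0BA9) sorts directly after NA, so "nanri" comes
before "nari". With the key above RA precedes NNNA, as in the traditional
alphabet, and the two words change places. A production solution would more
likely tailor a collation table than remap code points.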
Tamil grammar, probably the world's oldest written and a sophisticated
grammar, clearly defines the orthography of Tamil. Here again, Unicode does
not seem to believe that a language can have a grammar defining its
orthography.
In this regard, it is not too late to bring to the attention of the Unicode
Consortium how orthography is defined and how sorting is used.
We will analyse the requirements for collating Tamil by way of software
fixes.
To be continued....