Tamil Collation vs Transliteration/Transcription Enc Version2

From: Sinnathurai Srivas (sisrivas@blueyonder.co.uk)
Date: Sat Jun 25 2005 - 16:07:53 CDT

    Version 2 (Please provide comments.)

    Tamil Collation vs Transliteration/Transcription Encoding

    Though it suffers from numerous implementation problems, Unicode is based
    on a highly sophisticated technical architecture.

    Unfortunately, on the issue of collation, owing to the design of ISCII,
    Unicode had to abandon a sorting-based encoding of Tamil in favour of a
    transliteration-based encoding.
    While Devanagari has the upper hand in keeping its sorting-based encoding,
    all other Indic scripts were given a transliteration-based encoding. The
    saddest thing is that the transliteration-based encoding cannot operate as
    planned, because there is never a one-to-one mapping between languages.
    For example, Tamil K can indicate k, h, g, q, x and other related phonemes,
    while Devanagari has individual character shapes representing individual
    phonemes. Tamil is based on an alphabet-based phonemic system, while
    Devanagari is based on a phonemic system.
    If Unicode changed its policy from the unimportant and non-functioning
    transliteration-based encoding to a natural sorting-based encoding, that
    would be a superior solution.
    However, expecting Unicode to change its encoding philosophy from
    ISCII-based transliteration encoding to natural sorting-based encoding is
    not going to be easy.
    We will need to work with what is imposed on Tamil and find software
    solutions to resolve sorting requirements.
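
    As a rough illustration of what such a software fix could look like, the
    sketch below sorts Tamil strings with a custom key instead of raw code
    point order. The alphabet order used (vowels, aytham, the 18 consonants,
    then Grantha letters last), the placement of a pulli-marked consonant
    before the bare consonant, and the name tamil_sort_key are all
    assumptions of this sketch, not something defined by Unicode or by this
    post. It handles only simple letter-plus-sign sequences.

```python
import unicodedata

# Assumed traditional order: vowels, aytham, 18 consonants, Grantha letters last.
VOWELS = "அஆஇஈஉஊஎஏஐஒஓஔ"
AYTHAM = "ஃ"
CONSONANTS = "கஙசஞடணதநபமயரலவழளறன"
GRANTHA = "ஜஶஷஸஹ"
LETTER_RANK = {
    ch: i for i, ch in enumerate(VOWELS + AYTHAM + CONSONANTS + GRANTHA)
}

# Pulli (virama) and the dependent vowel signs aa, i, ii, u, uu, e, ee, ai, o, oo, au.
PULLI = "\u0bcd"
VOWEL_SIGNS = "\u0bbe\u0bbf\u0bc0\u0bc1\u0bc2\u0bc6\u0bc7\u0bc8\u0bca\u0bcb\u0bcc"
SIGN_RANK = {PULLI: 0}   # assumption: pure consonant sorts first
INHERENT_A = 1           # bare letter, no sign following
SIGN_RANK.update({ch: i + 2 for i, ch in enumerate(VOWEL_SIGNS)})


def tamil_sort_key(word):
    """Build a list of (class, letter rank, sign rank) tuples for one string."""
    # NFC joins two-part vowel signs (e.g. U+0BC6 U+0BBE) into one code point.
    text = unicodedata.normalize("NFC", word)
    key = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in LETTER_RANK:
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in SIGN_RANK:
                key.append((0, LETTER_RANK[ch], SIGN_RANK[nxt]))
                i += 2
                continue
            key.append((0, LETTER_RANK[ch], INHERENT_A))
        else:
            # Anything else sorts after Tamil letters, by code point.
            key.append((1, ord(ch), 0))
        i += 1
    return key


letters = ["ன", "ப", "ற", "ல"]
print(sorted(letters))                      # code point order: ['ன', 'ப', 'ற', 'ல']
print(sorted(letters, key=tamil_sort_key))  # assumed order:    ['ப', 'ல', 'ற', 'ன']
```

    The point of the sketch is only that the stored code point order and the
    order shown to the user can be decoupled in software. A real
    implementation would more likely tailor a general collation algorithm
    than hand-roll a key like this.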

    Tamil grammar, probably the world's oldest written grammar and a
    sophisticated one, clearly defines the orthography for Tamil. Here again
    Unicode does not seem to believe that a language can have a grammar
    defining its orthography. In this regard it is not too late to bring to
    the attention of the Unicode Consortium how orthography is defined and how
    sorting is used.

    We will analyse the requirements for collating Tamil by way of software
    fixes.

    To be continued....

