Re: Tamil Collation vs Transliteration/Transcription Enc

From: Antoine Leca
Date: Mon Jun 27 2005 - 02:47:39 CDT

    On Saturday, June 25th, 2005 23:55Z James Kass wrote:

    > Michael Everson wrote,
    >> The Unicode encoding is based on ISCII, not transliteration.

    As far as I know, in ISCII the code points are shared between scripts; in
    Unicode, they are differentiated.

    So while ISCII implements "automatic transliteration" (just change the
    reference script), Unicode went one step further and replicated the
    letters for each script. This opened a way to transliterate by shifting
    code points between blocks, but Unicode certainly does not require it.
    I am not aware that this is commonly implemented, because of the many
    corner cases; only between Nagari and Gujarati, or between Telugu and
    Kannada, could it be done easily.
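
    The shift idea can be sketched as below. This is only a naive
    illustration, not something described in the thread: the function name
    shift_script and the error handling are hypothetical, and the block start
    values are taken from the Unicode code charts. It refuses any character
    whose shifted position is unassigned in the target block, which is one
    kind of corner case; it makes no attempt to handle the others.

```python
import unicodedata

# Block start code points, taken from the Unicode code charts.
BLOCK_START = {
    "devanagari": 0x0900,
    "gujarati": 0x0A80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
}
BLOCK_SIZE = 0x80  # each of these blocks spans 128 code points


def shift_script(text, source, target):
    """Naively move each character of the source block to the same
    offset in the target block. Hypothetical helper for illustration."""
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            candidate = chr(cp - src + dst)
            # unicodedata.name() returns the default for code points
            # that are unassigned in Python's Unicode database.
            if unicodedata.name(candidate, None) is None:
                raise ValueError(
                    f"U+{cp:04X} has no counterpart in {target}"
                )
            out.append(candidate)
        else:
            # Characters outside the source block pass through unchanged.
            out.append(ch)
    return "".join(out)
```

    For instance, shifting the Devanagari letters U+092D U+093E U+0930 U+0924
    gives the Gujarati letters at U+0AAD U+0ABE U+0AB0 U+0AA4, whereas
    DEVANAGARI LETTER SHORT O (U+0912) lands on an unassigned Gujarati
    position and is rejected.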

    Another important point is that ISCII order is not innocent (neither is
    ASCII), but it is not the "obvious" order either: it has been tailored
    toward Hindi collation order (hence candrabindu before bindu before a,
    while traditional Sanskrit order has a+bindu after au).
    It is the same for us poor Spaniards, who are prevented from having our
    beloved ñ in the correct place by the combined Franco-English
    imperialism ;-).
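
    The ordering point can be checked with a small sketch. It uses the
    Unicode Devanagari code points as a stand-in for the ISCII arrangement
    (an assumption on my part, following the premise that the Unicode layout
    is based on ISCII), and relies on the fact that Python compares strings
    code point by code point.

```python
CANDRABINDU = "\u0901"  # DEVANAGARI SIGN CANDRABINDU
BINDU = "\u0902"        # DEVANAGARI SIGN ANUSVARA
A = "\u0905"            # DEVANAGARI LETTER A
AU = "\u0914"           # DEVANAGARI LETTER AU

# In code point order the two signs come before the first vowel letter.
signs_then_a = sorted([A, BINDU, CANDRABINDU])

# Plain code point comparison puts a+bindu right after a and before au,
# whereas the traditional Sanskrit order mentioned above has it after au.
code_point_order = sorted([AU, A + BINDU, A])
```

    A collation that follows the traditional Sanskrit order would therefore
    need tailoring on top of the raw code point order.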

    >> Brahmic scripts all have the same structure, Tamil included, though
    >> Tamil lost some of the original Brahmic letters.
    >> The encoding is based on ISCII, not transliteration.
    > If Unicode is based on ISCII and ISCII is based on transliteration,
    > then Unicode is transliteration-based with respect to Indic script
    > encoding.

    I disagree. Transliteration is not a transitive process. For example, if
    you transliterate Greek into Cyrillic, and then transliterate the result
    into Latin, you will end up with something pretty strange that people will
    not accept as a transliteration.
    This holds even if both transliterations are perfectly reversible;
    and in the Indic/ISCII/Unicode case, these transliterations are not
    perfect, in general.
