RE: Tamil 0BB3 and 0BD7

From: Kenneth Whistler (
Date: Mon Nov 10 2003 - 20:39:12 EST

    Peter Jacobi noted:

    > but it would still hold, that:
    > U+0B95 U+0BC6 U+0BB3 and
    > U+0B95 U+0BCC
    > are indistinguishable in written Tamil.

    This is a true ambiguity in the writing system.

    <U+0B95, U+0BC6, U+0BB3> ==> ke-l.a

    <U+0B95, U+0BCC> ==> kau
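To make the two sequences concrete, here is a minimal Python sketch that lists the character names with the standard `unicodedata` module and confirms that the sequences stay distinct as encoded text even after canonical normalization. The helper names `describe` and `same_encoded_text` are illustrative, not part of the original discussion.

```python
import unicodedata

# The two sequences from the message above.
KE_LLA = "\u0b95\u0bc6\u0bb3"  # KA + VOWEL SIGN E + LETTER LLA
KAU = "\u0b95\u0bcc"           # KA + VOWEL SIGN AU


def describe(text):
    """Return 'U+XXXX NAME' for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def same_encoded_text(a, b):
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    for label, seq in (("ke-l.a", KE_LLA), ("kau", KAU)):
        print(label)
        for line in describe(seq):
            print("   ", line)
    # The glyphs may look alike, but the encoded text does not compare equal.
    print("equal after NFD:", same_encoded_text(KE_LLA, KAU))
```

Under NFD, U+0BCC decomposes to U+0BC6 plus the length mark U+0BD7, not to U+0BC6 plus U+0BB3, so a normalizing comparison still treats the two spellings as different text.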

    Every analysis of Tamil that I have seen distinguishes the two
    letters, l.a versus -au, despite the overlap in glyph form.
    So it is clear that encoding them distinctly makes sense,
    even though they participate in the visual ambiguity cited
    above.

    There is also another graphological reason for the
    distinction. The -au character is a dependent vowel.
    You can't add other vowels such as -ii (U+0BC0) or -u (U+0BC1)
    to the rightmost glyph part of -au (the one that *looks*
    like the l.a consonant). But you *can* add -ii or -u to
    U+0BB3 l.a, so there is a clear difference in distribution
    and interaction with other characters.
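The difference in distribution can be roughly approximated from character properties. The sketch below uses the Unicode general category as a stand-in: U+0BB3 is a letter that can host a vowel sign, while U+0BCC is itself a dependent mark. The function `plausible_vowel_attachment` is a hypothetical helper, and the general-category test is a simplifying assumption, not a full model of Tamil syllable structure.

```python
import unicodedata

LLA = "\u0bb3"      # TAMIL LETTER LLA (consonant)
SIGN_AU = "\u0bcc"  # TAMIL VOWEL SIGN AU (dependent vowel)
SIGN_II = "\u0bc0"  # TAMIL VOWEL SIGN II
SIGN_U = "\u0bc1"   # TAMIL VOWEL SIGN U


def is_base_letter(ch):
    """True for characters in general category Lo (letter, other)."""
    return unicodedata.category(ch) == "Lo"


def is_dependent_sign(ch):
    """True for combining marks (general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def plausible_vowel_attachment(base, sign):
    """Rough proxy: a vowel sign attaches to a letter, not to another mark."""
    return is_base_letter(base) and is_dependent_sign(sign)


if __name__ == "__main__":
    print("l.a + -ii:", plausible_vowel_attachment(LLA, SIGN_II))
    print("l.a + -u: ", plausible_vowel_attachment(LLA, SIGN_U))
    print("-au + -ii:", plausible_vowel_attachment(SIGN_AU, SIGN_II))
```

A real validator would need script-specific syllable rules; the point here is only that the two characters behave differently as hosts, matching the distributional argument above.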

    Finally, for ISCII interoperability, there was no choice
    but separate encoding (not of the length mark U+0BD7 per se,
    of course, but of -au versus l.a).
