Re: Tamil Text Messaging in Mobile Phones

From: James Kass
Date: Wed Jul 31 2002 - 04:52:43 EDT

Dear Sinnathurai Srivas,

Is this a graphic showing the experimental diacritics you mention?

If so, it should be possible for most of these to be encoded in text
as pronunciation indicators using existing Unicode characters.

Glyph No.  Possible Unicode
0152       U+0309 *
0094       U+0302
0153       U+0303
0154       U+0308
0155       U+0304
0156       U+031A
0134       U+0325
0096       U+E6FD **
0135       U+0339
0136       U+02E9 ***
0137       U+????
0138       U+2321 ***
0139       U+2218 ***

*   The reference glyph in the standard is reversed, but the
    reference glyphs are only informative; the actual glyph
    shapes are up to the font developer.

**  The stroke from the Phaistos symbols in the ConScript PUA
    encoding is the closest I could find.

*** These characters were selected only for their appearance.

With the above in mind, here's an attempt to encode part of the
examples in the graphic linked above in Unicode (UTF-8):

அ̉ அ̂ அ̃ அ̈ அ̄ அ̚
க̥ க க̹ க˩ க? க⌡ க∘

Admittedly, the display of the above here is less than optimal,
but this is a font/display issue rather than an encoding issue.
(At least no dotted circles are appearing here in the display.)
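
As a minimal sketch of how such sequences are put together, the
following Python builds the first row from code points: each Tamil
base letter is simply followed by one combining mark from the table.
Python is used only for illustration, and the names MARKS and mark
are hypothetical. Only the combining marks are included; the PUA and
look-alike entries are left out.

```python
import unicodedata

TAMIL_A = "\u0B85"   # TAMIL LETTER A
TAMIL_KA = "\u0B95"  # TAMIL LETTER KA

# Glyph number -> suggested code point, copied from the table in this message.
# Only the combining marks are listed here.
MARKS = {
    "0152": 0x0309,
    "0094": 0x0302,
    "0153": 0x0303,
    "0154": 0x0308,
    "0155": 0x0304,
    "0156": 0x031A,
    "0134": 0x0325,
    "0135": 0x0339,
}


def mark(base, glyph_no):
    """Return the base letter followed by the combining mark for glyph_no."""
    return base + chr(MARKS[glyph_no])


# First row of the example: TAMIL LETTER A with each "above" mark in turn.
vowel_row = " ".join(
    mark(TAMIL_A, g) for g in ("0152", "0094", "0153", "0154", "0155", "0156")
)

# One item from the second row: TAMIL LETTER KA with the ring below.
ka_ring_below = mark(TAMIL_KA, "0134")

if __name__ == "__main__":
    # List the official character name for each suggested code point.
    for glyph_no, cp in MARKS.items():
        print(glyph_no, f"U+{cp:04X}", unicodedata.name(chr(cp)))
    # Show the UTF-8 bytes of the first row.
    print(vowel_row.encode("utf-8"))
```

Whether the marks are positioned well over the Tamil letters still
depends on the font and renderer; the code only fixes the encoding.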

As Peter Constable wrote recently in reply to Keld Jørn Simonsen:
>>My point has been that that language community would be much
>>better served by dropping the idea of using "@" in this way and picking
>>something else since, as suggested in your comment, 10646 has lots to
>>choose from.

And, this is a good point. There are many existing characters in
Unicode from which to choose, not only for orthographies, but
even for pronunciation symbols.

William Overington and Martin Kochanski independently suggested
that the Private Use Area would be well-suited for any experimental
characters, in case some forms can't already be found, or existing
Unicode forms are not acceptable. The PUA remains an option for
the pronouncing glyphs.
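
Because PUA code points carry no standardized meaning, text that uses
them only works where sender and receiver share the same font or
agreement. As a rough sketch (the helper names is_private_use and
private_use_chars are hypothetical), a Python check for such
characters could rely on the general category "Co":

```python
import unicodedata


def is_private_use(ch):
    """True if ch is a private-use code point (general category Co)."""
    return unicodedata.category(ch) == "Co"


def private_use_chars(text):
    """Return the private-use characters found in text, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]


if __name__ == "__main__":
    # TAMIL LETTER KA followed by the ConScript PUA code point from the table.
    sample = "\u0B95\uE6FD"
    print(private_use_chars(sample))
```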

Font replacement by the system sometimes solves problems and
sometimes creates them. Arbitrary font-switching is not an
encoding issue, but one way to avoid it is to use applications
that either don't do it or allow the user to disable it.

Best regards,

James Kass.

----- Original Message -----
From: "Sinnathurai Srivas" <>
To: <>
Sent: Tuesday, July 30, 2002 7:44 AM
Subject: Re: Tamil Text Messaging in Mobile Phones

Dear James Kass,

For a pronunciation dictionary, a set of experimental diacritics
needs to be included.


When these additional diacritics occur in text, the OS should not decide
that something is wrong with the grammar and substitute dotted circles, or
assume the font is faulty and replace it with another font that knows
nothing about these additional diacritics.

Sinnathurai Srivas
