Re: Tamil Text Messaging in Mobile Phones

From: James Kass
Date: Fri Jul 26 2002 - 10:21:40 EDT

Michael Kaplan wrote,

> > The changes advocated seem to be more related to the Tamil script
> > itself rather than the way that it is encoded.
> The changes for "Linear Tamil" are to leave the encoding exactly the way
> they have but to change all of the rules for re-ordering such that the
> ordering for encoding is much closer to visual.

This is approximately my understanding. But I'd say: "leave
the encoding exactly the way they have but change all the rules
for writing the text so that when a spoken vowel follows a spoken
consonant, the respective written vowel letter follows the written
consonant." (Coincidentally, this matches the current way Tamil is
encoded in Unicode.) In which case, the rules for re-ordering
wouldn't be changed; they'd be disabled whenever text is displayed
in the reform script.

> Text that is valid in the current encoding of Tamil is not valid using
> Linear Tamil.

As long as automatic re-ordering can be disabled, which could be
done very simply on one platform with a new script tag in the
OpenType registry and a tiny tweak to Uniscribe, this isn't
the case.
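
To make the difference concrete, here is a rough Python sketch of
the two display behaviours. It is only an illustration on code point
strings, not how Uniscribe or any real shaping engine works (those
operate on glyphs), and the function names are made up for this
example. The vowel signs and decompositions used are the ones in the
Unicode Tamil block.

```python
# Vowel signs that traditional Tamil draws to the LEFT of the consonant.
PREBASE = {"\u0bc6", "\u0bc7", "\u0bc8"}  # E, EE, AI

# Two-part vowel signs: (left part, right part) placed around the consonant.
TWO_PART = {
    "\u0bca": ("\u0bc6", "\u0bbe"),  # O  = E  + AA
    "\u0bcb": ("\u0bc7", "\u0bbe"),  # OO = EE + AA
    "\u0bcc": ("\u0bc6", "\u0bd7"),  # AU = E  + AU length mark
}


def is_tamil_consonant(ch):
    """Rough range test: Tamil consonants lie between KA and HA."""
    return "\u0b95" <= ch <= "\u0bb9"


def traditional_visual_order(text):
    """Hypothetical helper: the order in which traditional Tamil is drawn.

    Input is in Unicode (logical) order, consonant first, vowel sign second.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if is_tamil_consonant(ch) and nxt in PREBASE:
            out += [nxt, ch]           # vowel sign moves in front
            i += 2
        elif is_tamil_consonant(ch) and nxt in TWO_PART:
            left, right = TWO_PART[nxt]
            out += [left, ch, right]   # vowel sign wraps the consonant
            i += 2
        else:
            out.append(ch)             # everything else stays put
            i += 1
    return "".join(out)


def linear_order(text):
    """Hypothetical helper: reform display, with re-ordering disabled."""
    return text
```

With re-ordering disabled, the same encoded text is simply shown in
the order it is stored, which is the point made above: nothing about
the encoding itself has to change.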

Indeed, Uniscribe support only works on OpenType fonts. The
vowels don't get re-ordered on a mere TrueType font. It should be
currently possible to view Unicode reformed Tamil text with a
TrueType font containing revised glyphs properly encoded,
because the default re-ordering won't occur.

This was mentioned on the OpenType list, IIRC. The reason that
simply using TrueType isn't a permanent or acceptable solution
is that OpenType offers desired advanced typographic
features over and above ligature substitution.

> > Tamil's encoding is called logical encoding, as opposed to visual
> > encoding.
> > The reform proposals seem to point towards writing Tamil logically,
> > in this sense meaning in the same order that it's encoded.
> Actually, you have it backwards. "Linear Hebrew" would actually be *Visual*
> Hebrew, with the current encoding being logical.

I only got into this because I didn't want anyone to think that
we were implying that Tamil writing was currently illogical
because of an unfortunate choice in jargon.

<semantic rant>
(To me, any encoding which isn't visual isn't logical. And, so-
called visual Hebrew isn't visual at all, it's backwards because
when you scan Hebrew with your vision, your eyes are
supposed to be travelling RTL. When Hebrew is written,
it is written RTL. When it is visually encoded, it should
be encoded in the same order as it is written or seen.)
</semantic rant>

> > Problems with displaying reformed Tamil text encoded in Unicode
> > aren't related to the encoding itself; the encoding is fine. Problems
> > arise when default operating system handling re-orders certain
> > combinations, which is expected behaviour in traditional Tamil text,
> > but is completely unwanted under the reform.
> 1) The suggestion of changes to the script are not relevant to the linear
> Tamil issue since one will not make the other any easier or harder -- they
> are independent issues

The changes to the script are relevant to the linear Tamil issue
because the changes to the script include the notion that Tamil
is to be written linearly. The changes (modernizations) to some
of the glyphs are not relevant to the linear Tamil issue since
one will not make the other any easier or harder.

> 2) script reform is beyond the scope of both Unicode and INFITT's WG02.

And rightly so.

> And this is defined where? Unicode defines rules here, a choice of a font to
> ignore those rules is an interesting bow to compatibility with systems not
> smart enough to handle complex scripts, but how can it be Unicode when it
> happens, any more than Visual Hebrew is Unicode?

A script's rules are defined by the users of the script. It is not
up to a committee to decide how someone else's language is written.
Unicode codifies existing script rules. If those rules are changed
by the users, Unicode's gotta roll with the punches.

(Visual Hebrew isn't Unicode, but it is edocinU !)

> > Reform means change, though, and change implies that updates
> > will be necessary. If a property of a character changes through
> > popular use, then it's up to standards organizations to accommodate
> > the change.
> And when changes that are recognized by anyone are made, such things can
> happen -- none of which relates to whether the encoding form of Tamil is
> "linear" or not.

I'm lost.

> > Acceptance of any script reform is up to its users.
> And an appropriate forum in which to discuss reform? Something that is
> always complicated when a script is used by more than one country, and
> further complicated by the fact that its association with "Linear Tamil" is
> not really a true association.

Unicode's public e-mail list is not an appropriate forum for
reforming Tamil (or any other) script. It is appropriate to
discuss how reforms in general or for a specific script will
impact the standard. It should be appropriate for anyone
contemplating script reform to ask questions here about
the standard in order to help minimize that impact.

My impression is that this is what Sinnathurai Srivas has tried
to do.

Best regards,

James Kass.
