Re: Tamil Text Messaging in Mobile Phones

From: James Kass
Date: Fri Jul 26 2002 - 08:59:50 EDT

Martin Kochanski wrote,

> Isn't this sort of thing *exactly* what the private use
> area is for?

The PUA can certainly be used for trial runs and experiments,
including script reforms where appropriate.

> There aren't that many mobile phone manufacturers, and they
> should be able to agree what SMP PUA codes to use: mobile phones
> are pretty closed systems. Or am I missing something here?

They should even be able to agree on some BMP PUA code points,
which is possibly what you meant. PUA code points, as long as
they are exchanged between consenting parties, aren't limited
to mobile phone use, either.
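
For anyone who wants to check whether a given code point falls in a
private use range, here is a minimal Python sketch. The ranges are the
ones the Unicode Standard defines (one in the BMP, two in planes 15 and
16; the SMP itself, plane 1, has none). The names PUA_RANGES, pua_block
and is_private_use are made up for this illustration; only the
unicodedata module is standard.

```python
import unicodedata

# Private Use ranges defined by the Unicode Standard.
PUA_RANGES = {
    "BMP PUA": (0xE000, 0xF8FF),
    "Supplementary PUA-A (plane 15)": (0xF0000, 0xFFFFD),
    "Supplementary PUA-B (plane 16)": (0x100000, 0x10FFFD),
}


def pua_block(code_point):
    """Return the name of the PUA range containing code_point, or None."""
    for name, (first, last) in PUA_RANGES.items():
        if first <= code_point <= last:
            return name
    return None


def is_private_use(code_point):
    """True if the code point has General_Category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"
```

Both checks should agree: pua_block(0xE000) names the BMP range, while
a Tamil letter such as U+0B95 or the first SMP code point U+10000 is
reported as not private use.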

This is speaking generally of PUA and experiments. In the case
of the proposed reform for Tamil, there are some characters
considered which aren't presently covered by Unicode such as
currency symbols. This type of addition would need to go
through regular channels prior to becoming official, and
the PUA is an option.

> >Do we discourage people from altering their own scripts? Should we?
> Yes, you do; and yes, you should - if by "altering scripts" you mean
> altering the meanings of existing code points,
> ( ... )

So, if that isn't what is meant, would the answer be: No, we don't and
no, we shouldn't? In this case for reformed Tamil, it is my understanding
that no one is suggesting that the semantics of the existing encoded
abstract characters should be changed. There will always be a need
to encode, transmit, store, and display traditional Tamil text; much
like there is a need to do the same with traditional Chinese text.

Other than some new symbols, existing encoded Tamil text should be
able to be displayed as either traditional or reformed style text without
converting any data or changing anything about the meaning of the

> ( ... )
> or even the way that they are to be rendered.

The appearance of glyphs is determined by type designers, so if the
appearance of a letter is modernized, it's compliant to produce a
modernized font.

So far, it seems we pretty much agree, if my understanding of
the issues involved is correct.

This brings us to the character properties which would affect
the re-ordering aspect of rendering.

Suppose that an existing RTL script became LTR. (Probably
not going to happen, but just suppose.) Would it be better to
re-encode all of, say, Arabic abstract characters as new
characters, or would it be better to adjust a property?
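
To make "a property" concrete: the directionality in question is
carried by each character's bidirectional class, which is separate from
the code point itself. A small Python sketch, assuming nothing beyond
the standard unicodedata module (the helper name bidi_classes is
invented for this example), lists that class for each character:

```python
import unicodedata


def bidi_classes(text):
    """Pair each character's code point with its bidirectional class."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# ARABIC LETTER ALEF, HEBREW LETTER ALEF, TAMIL LETTER KA
for code_point, bidi in bidi_classes("\u0627\u05D0\u0B95"):
    print(code_point, bidi)
# U+0627 AL
# U+05D0 R
# U+0B95 L
```

In the hypothetical above, adjusting the property would mean changing
the class reported for the existing code points, while re-encoding
would leave these values alone and add new characters.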

> As a software publisher, I would argue that the rendering and
> behaviour of a given Unicode code point should *never* change:
> literally never, even if the script is long dead, no-one can read
> it, and the glyph has acquired an offensive meaning, like the
> once innocent swastika. I want people to be still using our software
> in 20 years' time without the need for constant updates.

If you'd consider changing the phrase "rendering and behaviour" in
your paragraph above to "semantics or meaning" and dropping the
part about "constant updates", I'd be pleased to stand by your side
and help you argue.

For rendering and behaviour, nobody should want to see existing
implementations broken. It may become necessary to modify
some aspects of the standard from time-to-time and we should
look for ways to accomplish this without breakage.

On constant updates, Unicode is a growing standard and existing
scripts do sometimes change. Updates would seem to be required
in order to stay up-to-date.

Best regards,

James Kass.
