Some comments below.
◄ “Eppur si muove” ►
----- Original Message -----
From: "Samphan Raruenrom" <email@example.com>
To: "Asmus Freytag" <firstname.lastname@example.org>
Cc: "Sreedhar M" <email@example.com>; <firstname.lastname@example.org>; "Rick
Sent: Tuesday, July 16, 2002 07:22
Subject: Re: Is UniCode's Thai character representation is acceptable
by TISI or not?
> Asmus Freytag wrote:
> > At 12:06 PM 7/16/02 +0700, Samphan Raruenrom wrote:
> >> There are some mistakes in the Unicode character
> >> properties for Thai characters, and you have to "code around" them.
> > And the mistakes are?
> I've discussed a few of them here on this list. I'll write
> a more formal report on the issue later. Here are some headings:
> Problems from Unicode properties
> - errors in the combining classes of vowel signs make normalization
> fail in some cases. This is important if you want to compare strings.
Meaning: the normalized forms of two strings are not equal in cases
where Thais would consider them equal, right?
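
To make the question concrete, here is a minimal Python sketch of one plausible reading of the complaint. It is an illustration only, not Samphan's own test case: the choice of KO KAI, SARA I, SARA U and MAI EK is an assumption, and the values come from whatever Unicode data ships with the Python interpreter. SARA U has a non-zero combining class, so a tone mark typed before it is reordered by normalization; SARA I has class 0, so the two typing orders stay distinct.

```python
import unicodedata

KO_KAI = "\u0e01"  # THAI CHARACTER KO KAI (consonant)
SARA_I = "\u0e34"  # THAI CHARACTER SARA I (vowel sign above)
SARA_U = "\u0e38"  # THAI CHARACTER SARA U (vowel sign below)
MAI_EK = "\u0e48"  # THAI CHARACTER MAI EK (tone mark)


def nfc(text):
    """Return the NFC normalized form of text."""
    return unicodedata.normalize("NFC", text)


# Canonical combining classes as recorded in the character database.
ccc = {unicodedata.name(c): unicodedata.combining(c)
       for c in (SARA_I, SARA_U, MAI_EK)}

# Tone mark (class 107) before SARA U (class 103): normalization reorders
# the marks, so both typing orders end up identical.
below_equal = nfc(KO_KAI + MAI_EK + SARA_U) == nfc(KO_KAI + SARA_U + MAI_EK)

# SARA I has class 0, which blocks reordering, so the two orders stay
# different after normalization.
above_equal = nfc(KO_KAI + MAI_EK + SARA_I) == nfc(KO_KAI + SARA_I + MAI_EK)

print(ccc)
print("vowel below, both orders equal after NFC:", below_equal)
print("vowel above, both orders equal after NFC:", above_equal)
```

A string comparison that relies on normalization alone would therefore treat the two "vowel above" spellings as different strings.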
> - the decomposition of SARA AM adds more problems to normalization
I don't recall seeing that note; I'll look forward to your report.
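
For reference, the decomposition in question can be inspected with Python's `unicodedata` module. This is only a sketch of what the character database records; the sample word (NO NU + MAI THO + SARA AM) is chosen here for illustration and is not taken from the earlier note.

```python
import unicodedata

SARA_AM = "\u0e33"  # THAI CHARACTER SARA AM

# SARA AM carries a compatibility decomposition into NIKHAHIT + SARA AA.
decomp = unicodedata.decomposition(SARA_AM)

# Canonical normalization (NFC) leaves SARA AM alone ...
nfc_unchanged = unicodedata.normalize("NFC", SARA_AM) == SARA_AM

# ... but compatibility normalization (NFKC) splits it. With a tone mark
# in front, the NIKHAHIT ends up after the tone mark in the result.
word = "\u0e19\u0e49\u0e33"  # NO NU, MAI THO, SARA AM
nfkc = unicodedata.normalize("NFKC", word)
nfkc_names = [unicodedata.name(c) for c in nfkc]

print(decomp)
print(nfc_unchanged)
print(nfkc_names)
```

In the NFKC result the tone mark precedes NIKHAHIT, which is the kind of ordering a Thai-aware process may need to code around.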
> - some properties make grapheme clusters for Thai
> incompatible with what Thais expect, e.g. PHINTHU as
> virama, SARA AM not a combining character
At the last UTC meeting, action was taken that is not yet reflected in the
draft TR on boundaries. In particular, this affects Thai.
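
The two properties mentioned in the quoted point can be looked up directly. The sketch below only reads the general category and combining class from Python's bundled Unicode data; it does not implement the boundary rules of the TR, and MAI HAN-AKAT is included purely as a contrasting example.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, canonical combining class)."""
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch))


props = [describe(ch) for ch in ("\u0e31", "\u0e33", "\u0e3a")]

for name, category, ccc in props:
    # "Mn" is a nonspacing combining mark, "Lo" an ordinary letter;
    # combining class 9 is the class used for viramas.
    print(f"{name}: category={category}, combining class={ccc}")
```

SARA AM comes out as a letter (`Lo`) rather than a combining mark, and PHINTHU carries the virama class, which matches the two properties named above.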
> Inaccuracy in the Unicode book
> - backspace 'always' uses the same (grapheme cluster) character
> as Del and left/right arrow. Actually Thais use backspace to
> delete a single character, not the whole cluster. So the character boundary for
> backspace should be locale-specific.
This text will be overridden by the TR.
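
The difference between the two behaviours can be sketched as follows. Both function names are hypothetical, the cluster rule is a deliberately crude approximation (a base character plus any trailing combining marks), and the reading that Thai users expect backspace to remove one code point is an assumption drawn from the quoted point, not from the TR.

```python
import unicodedata

MARK_CATEGORIES = ("Mn", "Mc", "Me")  # combining mark categories


def backspace_code_point(text):
    """Remove only the last code point (assumed Thai backspace behaviour)."""
    return text[:-1]


def delete_cluster(text):
    """Remove the last base character together with its trailing marks.

    Simplified approximation of a grapheme cluster, not the TR's rules.
    """
    i = len(text)
    while i > 0 and unicodedata.category(text[i - 1]) in MARK_CATEGORIES:
        i -= 1  # skip back over combining marks
    return text[:max(i - 1, 0)]  # also drop the base character


syllable = "\u0e01\u0e34\u0e48"  # KO KAI + SARA I + MAI EK

after_backspace = backspace_code_point(syllable)  # tone mark removed only
after_cluster = delete_cluster(syllable)          # whole syllable removed
```

An editor could pick between the two functions per locale, which is one way to read the request for locale-specific boundaries.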
> - in Thai, zero width space is said to be able to expand in
> paragraph. Actually it is always zero width.
There may be some misunderstanding here. What is meant is: if you had
the sequence ABCD, and between the B and the C was a zero-width space,
AND you were using inter-character spacing for justification, you would not
expect to see:
A BC D
Instead, you would expect to see:
A B C D
That is, the zero-width space does not prevent the characters from
using inter-character spacing.
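
The same ABCD example can be written as a small sketch. The function `letterspace` and its `zwsp_blocks` flag are hypothetical, invented here to contrast the two outcomes; a real layout engine would distribute widths rather than insert space characters.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def letterspace(text, gap=" ", zwsp_blocks=False):
    """Insert gap between visible characters for justification.

    With zwsp_blocks=True the zero-width space suppresses the gap at its
    position, which is the behaviour you would NOT expect.
    """
    out = []
    after_zwsp = False
    for ch in text:
        if ch == ZWSP:
            after_zwsp = True  # the ZWSP itself adds no width
            continue
        if out and not (zwsp_blocks and after_zwsp):
            out.append(gap)
        out.append(ch)
        after_zwsp = False
    return "".join(out)


sample = "AB" + ZWSP + "CD"
expected = letterspace(sample)                      # "A B C D"
unexpected = letterspace(sample, zwsp_blocks=True)  # "A BC D"
```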
> These are things you have to know after learning Unicode
> if you plan to work with the Thai language, to 'code around' the problems
> and make it acceptable to Thai people.
> I plan to write a formal report on the issue, not to change the standard
> but to note what is wrong and what has to be coded around. So people
> who would like to work with the Thai language (like you) will know the right
> thing to do and not repeat the same mistakes as in some software.
> Samphan Raruenrom
> Information Research and Development Division,
> National Electronics and Computer Technology Center, Thailand.