Re: Major Defect in Combining Classes of Tibetan Vowels

From: Christopher John Fynn (cfynn@gmx.net)
Date: Fri Jun 27 2003 - 16:37:32 EDT

Next message: Philippe Verdy: "Re: Biblical Hebrew"

Previous message: Karljürgen Feuerherm: "Re: Biblical Hebrew"
In reply to: Christopher John Fynn: "Re: Major Defect in Combining Classes of Tibetan Vowels: Illustration"
Next in thread: Peter_Constable@sil.org: "Re: Major Defect in Combining Classes of Tibetan Vowels"

Rick McGowan <rick@unicode.org> has privately suggested moving
the discussion of Combining Classes of *Tibetan* Characters
from the main Unicode list unicode@unicode.org to the TIBEX list
tibex@unicode.org - an "experts" list which was set up several
years ago specifically to discuss proposals for encoding Tibetan
characters in Unicode. If there are people who have a
particular interest in Tibetan characters and have been
following the thread here who would like to continue following
this thread - perhaps they could ask Rick how they can join that
list.

I'll follow Rick's advice - perhaps this discussion is more
appropriate on the TIBEX list - even though the similar issues
with some Hebrew characters which have been raised here (again)
as a result of this thread make me think there may be a need for
a non-script-specific solution or work-around to problems with
canonical combining class values.

Anyway I'm going to move this discussion over there with a
parting shot...

Off-list Robert Chilton has pointed out to me the following:

> 3. A very common occasion of 0F7E occurring with a vowel is in the stack
> HaUm (orthographic sequence of 0F67 0F71 0F74 0F7E). Because 0F7E is
> currently assigned a cc of zero, this *same glyph-form* could
> theoretically be encoded with a total of 6 different character
> sequences, resulting in 4(!) different sequences following
> normalization. Properly, all 6 sequences should normalize to the same
> sequence -- which is indeed the case if 0F82 or 0F83 is used in place of
> 0F7E. Obviously a major problem, not only for rendering but also for
> searching and sorting.

FOUR different sequences possible *after* "normalisation" ???
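
For anyone who wants to check Robert's count, here is a minimal sketch
in Python. It is not taken from his message: it assumes the combining
classes shipped in Python's own unicodedata tables (0F7E with a class of
zero, 0F82 and 0F83 with a non-zero class), and the helper name
normalized_forms is made up for illustration. It writes the three marks
after 0F67 in every possible order and collects the distinct results of
NFD normalisation.

```python
import unicodedata
from itertools import permutations

BASE = "\u0f67"                 # TIBETAN LETTER HA
VOWELS = ("\u0f71", "\u0f74")   # TIBETAN VOWEL SIGN AA, TIBETAN VOWEL SIGN U


def normalized_forms(sign):
    """Return the distinct NFD results over every ordering of the marks.

    `sign` is the third mark in the stack (0F7E, 0F82 or 0F83).
    This helper is hypothetical, written only for this illustration.
    """
    marks = VOWELS + (sign,)
    return {
        unicodedata.normalize("NFD", BASE + "".join(order))
        for order in permutations(marks)   # 3 marks -> 6 orderings
    }


for sign in ("\u0f7e", "\u0f82", "\u0f83"):
    forms = normalized_forms(sign)
    # combining() reports the canonical combining class of one character
    print(
        "U+%04X ccc=%d: %d distinct form(s) after NFD"
        % (ord(sign), unicodedata.combining(sign), len(forms))
    )
    for form in sorted(forms):
        print("   ", " ".join("%04X" % ord(ch) for ch in form))
```

Because a mark with a combining class of zero acts as a barrier to
canonical reordering, the two vowel signs can only be sorted when they
sit next to each other on the same side of 0F7E. That is why the six
input orders collapse to four rather than one, while with 0F82 or 0F83
all six collapse to a single sequence.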

Personally I would rather have seen all Tibetan characters
given a CCV of 0 (and all pre-combined Tibetan characters
*strongly* deprecated) than this. If someone simply
follows the normal rules for writing Tibetan, then characters
will be entered in a very predictable order which is far easier
to process than the one(s) they can end up in after Unicode
"normalisation".

- Chris Fynn

BTW My apologies to anyone who receives two copies of this
message.


This archive was generated by hypermail 2.1.5 : Fri Jun 27 2003 - 17:14:43 EDT