From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 05 2003 - 21:06:00 EDT
Kent Karlsson said:
> I see no particular *technical* problem with using WJ, though. In
> contrast to the suggestion of using CGJ (re. another problem) anywhere
> else but at the end of a combining sequence. CGJ has combining class 0,
> despite being invisible and not ("visually") interfering with any other
> combining mark. Using CGJ at a non-final position in a combining
> sequence puts in doubt the entire idea with combining classes and
> normal forms.
Why? There are any number of combining characters with combining
class 0, including the vast majority of Indic dependent vowels,
for instance.
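The claim about class 0 can be checked against the Unicode Character Database. Here is a minimal sketch, assuming Python's standard unicodedata module; the helper name combining_classes and the choice of sample code points are illustrative, not part of the original message:

```python
import unicodedata

def combining_classes(text):
    """Return (code point, name, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
            for ch in text]

# Two Devanagari dependent vowels, two Myanmar dependent vowels, and CGJ.
for row in combining_classes("\u093E\u0940\u1031\u102C\u034F"):
    print(row)
```

Every row should report a combining class of 0, even though all five characters are combining characters.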
A combining character sequence is a base character followed
by any number of combining characters. There is no constraint
in that definition that the combining characters have to
have non-zero combining class.
Canonical reordering is scoped to stop at combining class = 0.
It doesn't say that it applies to combining character sequences
per se. It applies to *decomposed* character sequences
(meaning, effectively, any sequence which has had the recursive
application of the decomposition mappings done).
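To illustrate the scoping, here is a small sketch, again assuming Python's unicodedata module. The Latin sample strings and the helper cps are made up for this illustration. Two marks with non-zero classes are reordered by NFD, but a class 0 character between them (CGJ here) ends the reordering scope:

```python
import unicodedata

def cps(text):
    """List the code points of text as four-digit hex strings."""
    return [f"{ord(ch):04X}" for ch in text]

plain = "a\u0301\u0323"            # acute (ccc 230), then dot below (ccc 220)
with_cgj = "a\u0301\u034F\u0323"   # same marks with CGJ (ccc 0) in between

print(cps(unicodedata.normalize("NFD", plain)))
# ['0061', '0323', '0301']  (marks swapped into ascending class order)
print(cps(unicodedata.normalize("NFD", with_cgj)))
# ['0061', '0301', '034F', '0323']  (unchanged: reordering stops at ccc 0)
```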
Take a Myanmar example: /kau/:
character sequence:   <1000, 1031, 102C, 1039, 200C>
combining?:            no    yes   yes   yes   no
combining classes:     0     0     0     9     0
comb char sequence:    ----------------------
canon reorder scope:   ---|  ---|  ---------|  ---|
The combining character sequence here is: <1000, 1031, 102C, 1039>
The *syllable* consists of that plus the trailing ZWNJ.
But the relevant sequences for application of the
canonical reordering algorithm are each sequence starting
with a character of combining class zero and continuing through
any following characters whose combining class is not zero.
Here that gives <1000>, <1031>, <102C, 1039>, and <200C>.
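The segmentation just described can be sketched in a few lines. This assumes Python's unicodedata module, and the function name reorder_scopes is hypothetical; it only splits the string into scopes and does not perform the reordering itself:

```python
import unicodedata

def reorder_scopes(text):
    """Split text into runs: one ccc=0 character plus any following ccc!=0 characters."""
    scopes = []
    for ch in text:
        # A class 0 character (or the very first character) opens a new scope.
        if unicodedata.combining(ch) == 0 or not scopes:
            scopes.append([ch])
        else:
            scopes[-1].append(ch)
    return [[f"{ord(c):04X}" for c in run] for run in scopes]

kau = "\u1000\u1031\u102C\u1039\u200C"
print(reorder_scopes(kau))
# [['1000'], ['1031'], ['102C', '1039'], ['200C']]
```

Only the third scope contains a character with a non-zero class, and with a single such character there is nothing to reorder.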
I don't see how introduction of CGJ into such sequences calls
any of the definitions or algorithms into question.
--Ken