From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Thu Aug 07 2003 - 12:32:47 EDT
> > Anyway, John J, what code are we talking about that has to work from
> > the positions of the combining marks back to the underlying
> > representation? Are you talking about OCR?
> >
>
> No, the issue is more how to start from a base form and work forward to
> encompass the whole series of characters which need to be treated "as
> one" in certain processes, which can include cursor movement, hit
> testing, display, line breaking, collation, normalization.

Collation isn't really based on combining sequences (even though UTS 10
specifies a certain "spanning" over non-blocking (combining)
characters).

Note in particular the following entry in the CTT (and with different
syntax in the UTS 10 tables):

<U0E4D_0E32> <S0E33>;<BASE>;<MIN>;<U0E33> % THAI CHARACTER SARA AM

(and a similar one for Lao). This is a collation entry for a
"contraction" of a combining mark followed(!) by (formally) a
base character. (I'm not really sure what the true logical sequence
would be, though.)
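
To make the "combining mark followed by a base character" point concrete,
here is a minimal sketch using Python's standard unicodedata module. It
looks up the compatibility decomposition of SARA AM (U+0E33) and of its
Lao counterpart (U+0EB3), then prints the general category of each part.
The helper names describe and compat_parts are made up for this
illustration, and the results depend on the Unicode data shipped with the
interpreter.

```python
import unicodedata


def describe(ch):
    """Return (code point, general category, combining class, name) for ch."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),   # e.g. "Mn" = nonspacing mark, "Lo" = other letter
        unicodedata.combining(ch),  # canonical combining class
        unicodedata.name(ch),
    )


def compat_parts(ch):
    """Return the NFKD (compatibility) decomposition of ch as a string."""
    return unicodedata.normalize("NFKD", ch)


# U+0E33 is the Thai SARA AM, U+0EB3 the corresponding Lao vowel sign.
for composed in ("\u0E33", "\u0EB3"):
    print(describe(composed))
    for part in compat_parts(composed):
        print("   ", describe(part))
```

For the Thai character the parts come out as U+0E4D (category Mn, a
combining mark) followed by U+0E32 (category Lo, formally a base
character), which is the sequence the <U0E4D_0E32> entry contracts. Code
that only groups a base with the marks that follow it would therefore
split this sequence in two.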
/kent k