Re: Merging combining classes, was: New contribution N2676

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Oct 27 2003 - 18:39:56 CST


From: "Peter Kirk" <peterkirk@qaya.org>

> On 27/10/2003 12:28, Mark Davis wrote:
>
> >Collation is very different, and already has mechanisms for dealing with
> >sequences. So no CGJ is needed there (except for case 2).
> >
> >Mark
> >
> >
> >
> Mark, can you outline what these mechanisms are or point me to a
> definition e.g. in a section of UTR #10? As I had understood it, the
> only way to deal with sequences of the sort I have in mind is to list
> each possible sequence individually as a contraction. The
> Logical_Order_Exception property (see
> http://www.unicode.org/reports/tr10/ section 3.1.3) just might be
> useful, but doesn't seem to have the necessary flexibility as it causes
> a character to be swapped with ANY following character, not just with
> any of a restricted list of such characters. The backwards marking used
> for French accents (section 3.1.2) seems to apply over too long a
> string.

Backwards marking is not restricted to French accents at collation
level 2. You can apply reverse ordering at any tailored level to fit other
needs, and you can also insert an extra collation level.
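
To make the effect of a backwards level concrete, here is a minimal Python
sketch. It is not the UCA algorithm: the accent weights and the sort_key
helper are my own assumptions for illustration, and only the idea of
reversing one level before comparison comes from the discussion above.

```python
import unicodedata

# Toy secondary weights (an assumption, not taken from the default UCA table).
ACCENT_WEIGHT = {"\u0301": 1, "\u0302": 2}  # combining acute, circumflex

def sort_key(word, backwards_secondary=False):
    """Build a two-level key: base letters first, then accent weights."""
    primary = []
    secondary = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Record the accent weight on the preceding base letter.
            if secondary:
                secondary[-1] = ACCENT_WEIGHT.get(ch, 9)
        else:
            primary.append(ch.lower())
            secondary.append(0)
    if backwards_secondary:
        # Backwards marking: compare this level from the end of the string.
        secondary.reverse()
    return (tuple(primary), tuple(secondary))

# côté, cote, coté, côte
words = ["c\u00f4t\u00e9", "cote", "cot\u00e9", "c\u00f4te"]
forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backwards_secondary=True))
```

With the forward secondary level the order is cote, coté, côte, côté; with
the backwards level it becomes cote, côte, coté, côté. The same reversal
could be applied to any other level of the key.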

So I think Mark is right here: this gives you full control over the
length of the collating sequence at each level of the collation keys.
Case 2 is indeed an exception.

The drawback is that the current default UCA ordering table does not create
such collation keys with intermediate levels for Hebrew vowels, so you
need tailoring to create a base level with consonants, one level with
vowels, a third level for sin/shin dots, a fourth for meteg, a fifth for
accents... unless the text is encoded in logical order using the
CCO-convention.
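
As a rough sketch of what such a layered key could look like, the Python
below splits a pointed Hebrew string into five levels. The code point ranges,
the level_of and hebrew_key names, and the use of raw code points as weights
are all assumptions for illustration; a real tailoring would assign proper
UCA weights, and this toy version ignores dagesh and every other character it
does not classify.

```python
def level_of(ch):
    """Return the level index for a Hebrew character, or None to ignore it."""
    cp = ord(ch)
    if 0x05D0 <= cp <= 0x05EA:
        return 0  # consonants (base level)
    if 0x05B0 <= cp <= 0x05BB:
        return 1  # vowel points
    if cp in (0x05C1, 0x05C2):
        return 2  # shin dot, sin dot
    if cp == 0x05BD:
        return 3  # meteg
    if 0x0591 <= cp <= 0x05AF:
        return 4  # accents (cantillation marks)
    return None

def hebrew_key(text):
    """Build a five-level key; a later level only breaks ties in earlier ones."""
    levels = [[] for _ in range(5)]
    for ch in text:
        lvl = level_of(ch)
        if lvl is not None:
            # Raw code points stand in for real collation weights here.
            levels[lvl].append(ord(ch))
    return tuple(tuple(level) for level in levels)

a = "\u05D1\u05B8\u05D2"  # bet + qamats, gimel
b = "\u05D1\u05B7\u05D3"  # bet + patah, dalet
by_code_point = sorted([a, b])
by_levels = sorted([a, b], key=hebrew_key)
```

Plain code point order puts b first because patah precedes qamats, whereas
the layered key compares the consonants first and puts a (gimel) before b
(dalet). The vowels are consulted only when the consonants are identical.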

Philippe.


