Re: Merging combining classes, was: New contribution N2676

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Oct 27 2003 - 20:06:52 CST

Next message: Doug Ewell: "Re: Unicode and Script Encoding Initiative in San Jose Mercury News"
Previous message: Philippe Verdy: "Re: Merging combining classes, was: New contribution N2676"
In reply to: Peter Kirk: "Re: Merging combining classes, was: New contribution N2676"
Next in thread: Peter Kirk: "Re: Merging combining classes, was: New contribution N2676"
Reply: Peter Kirk: "Re: Merging combining classes, was: New contribution N2676"
Reply: Kent Karlsson: "RE: Merging combining classes, was: New contribution N2676"

> Thanks for the clarification. In principle we might be able to go a
> little further: we could define both <c, CCO> and <CCO, c> as
> canonically equivalent to c for all c in combining class zero. This
> would have to be some kind of decomposition exception so that c is never
> decomposed by adding CCO before or after it. This would not remove CCO
> between two combining characters, so, if 0<c1<c2, <c1, c2> and <c1, CCO,
> c2> would remain not canonically equivalent while logically equivalent.
> In practice this would be a small price to pay as it is relevant only in
> the almost unique case of two vowels on one consonant which actually
> happen to be in canonical order.

Why that?

As CCO is not defined in any past version, the stability pact does
not say that we must forbid its _removal_ when computing NFC or NFD
or NFKC or NFKD forms. It just says that we must _not insert_ it in a
source string <c1, c2> where c1 and c2 are already assigned.

So we are fine: we can define a canonical equivalence between
<c1, CCO, c2> and <c1, c2>, where the latter is simultaneously in
NFC, NFD, NFKC and NFKD forms, for every pair (c1, c2) such that
CC(c1)<=CC(c2) or CC(c2)=0.

However, we cannot define it within the UCD itself, only algorithmically,
as is done for Hangul syllables/jamos.
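
To make the rule concrete, here is a rough Python sketch of such an
algorithmic removal step. It is only an illustration under assumptions:
CCO has no assigned code point, so a private-use character (U+E000)
stands in for it, and the function name strip_cco is hypothetical. The
combining classes come from Python's standard unicodedata module.

```python
import unicodedata

# Hypothetical stand-in: CCO is not an assigned character, so a
# private-use code point is used here purely for illustration.
CCO = "\uE000"

def strip_cco(text: str) -> str:
    """Remove CCO between c1 and c2 when CC(c1) <= CC(c2) or CC(c2) == 0."""
    out = []
    for i, ch in enumerate(text):
        # Only a CCO with a character on both sides is a candidate.
        if ch == CCO and 0 < i < len(text) - 1:
            cc1 = unicodedata.combining(text[i - 1])
            cc2 = unicodedata.combining(text[i + 1])
            if cc1 <= cc2 or cc2 == 0:
                continue  # <c1, CCO, c2> is treated as equivalent to <c1, c2>
        out.append(ch)
    return "".join(out)
```

With this sketch, a CCO between U+0323 (class 220) and U+0301 (class 230)
is dropped, while a CCO between U+0301 and U+0323 is kept, since removing
it there would let canonical reordering swap the two marks.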


This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:25 CST