Re: Unicode Normalisation Optimisation Experiments

From: Peter Kirk (peterkirk@qaya.org)
Date: Thu Sep 25 2003 - 06:08:47 EDT

On 24/09/2003 14:58, Jon Hanna wrote:

>... For example since following the decomposition <U+0104> -> <U+0041, U+0328> there can be no character that is unblocked from the U+0041 that will combine with it, hence there is no circumstance in which they will not be recombined to U+0104 and hence dropping that decomposition from the data will not affect NFC (the relevant data would still have to be in the composition table, as the sequence <U+0041, U+0328> might occur in the source code).
>
Is this actually correct? For example, if I have in my data the string
<U+0104, U+05B0> (which I know is garbage, but that is irrelevant), that
will decompose and reorder to <U+0041, U+05B0, U+0328>, as U+0328 has a
higher combining class (202) than U+05B0 (10). What does this become in
NFC? Is the reordering reversed and the combination reapplied?
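
One way to check this empirically is with the unicodedata module in
Python's standard library. This is only a sketch: it assumes the
interpreter's bundled Unicode database carries the usual combining
classes for U+0328 and U+05B0, and the helper name show() is made up
for illustration.

```python
import unicodedata

def show(s):
    """Format a string as <U+XXXX, ...> to match the notation used in this thread."""
    return "<" + ", ".join(f"U+{ord(c):04X}" for c in s) + ">"

src = "\u0104\u05B0"  # A WITH OGONEK followed by HEBREW POINT SHEVA

# Canonical combining classes, which drive the canonical reordering step.
ccc_ogonek = unicodedata.combining("\u0328")
ccc_sheva = unicodedata.combining("\u05B0")

nfd = unicodedata.normalize("NFD", src)  # decompose, then reorder by class
nfc = unicodedata.normalize("NFC", src)  # decompose, reorder, then recompose

print(ccc_ogonek, ccc_sheva)  # 202 10
print(show(nfd))              # <U+0041, U+05B0, U+0328>
print(show(nfc))              # <U+0104, U+05B0>
```

If the last line prints <U+0104, U+05B0>, then the intervening U+05B0
does not block U+0328 from recombining with U+0041, because its
combining class is lower, and NFC restores the original order.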

This is not only a theoretical issue, as the same applies to some real
combinations. There was discussion only last week on the bidi list of a
form which might be encoded <U+064A, U+0652, U+0654> but which would be
messed up if composed into <U+0626, U+0652>.
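
The Arabic case can be probed the same way. Again a sketch using
Python's unicodedata module, on the assumption that its data agrees
with the current Unicode Character Database; the variable names are
only illustrative.

```python
import unicodedata

seq = "\u064A\u0652\u0654"  # YEH, SUKUN, HAMZA ABOVE

# Combining class of each character in the sequence as typed.
classes = [unicodedata.combining(c) for c in seq]

# NFC result, listed as code points for easy comparison.
composed = unicodedata.normalize("NFC", seq)
codes = [f"U+{ord(c):04X}" for c in composed]

print(classes)  # [0, 34, 230]
print(codes)    # ['U+0626', 'U+0652']
```

Here the sukun (class 34) sits between the yeh and the hamza (class
230) without blocking it, so the hamza is pulled back onto the yeh.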

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/
