From: Peter Kirk (peterkirk@qaya.org)
Date: Thu Sep 25 2003 - 06:08:47 EDT
On 24/09/2003 14:58, Jon Hanna wrote:
>... For example since following the decomposition <U+0104> -> <U+0041, U+0328> there can be no character that is unblocked from the U+0041 that will combine with it, hence there is no circumstance in which they will not be recombined to U+0104 and hence dropping that decomposition from the data will not affect NFC (the relevant data would still have to be in the composition table, as the sequence <U+0041, U+0328> might occur in the source code).
>
Is this actually correct? For example, if I have in my data the string
<U+0104, U+05B0> (which I know is garbage, but that is irrelevant), that
will decompose and reorder to <U+0041, U+05B0, U+0328>, as U+0328 has a
higher combining class (202) than U+05B0 (10). What does this become in
NFC? Is the reordering reversed and the combination reapplied?
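
One way to check this empirically is the following minimal sketch, assuming Python's standard unicodedata module and the Unicode data bundled with the interpreter. The helper names codepoints and classes are made up for illustration; they are not part of any library.

```python
import unicodedata

# The "garbage" test string from above: A-with-ogonek followed by Hebrew sheva.
s = "\u0104\u05B0"

nfd = unicodedata.normalize("NFD", s)
nfc = unicodedata.normalize("NFC", s)


def codepoints(text):
    """Hypothetical helper: render a string as U+XXXX notation."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


def classes(text):
    """Hypothetical helper: canonical combining class of each character."""
    return [unicodedata.combining(c) for c in text]


print("NFD:", codepoints(nfd), classes(nfd))
print("NFC:", codepoints(nfc), classes(nfc))
```

If the implementation follows the composition rules in UAX #15, the sheva (class 10) should not block the ogonek (class 202) from the base letter, so the ogonek recombines with U+0041 and the sheva is left after the composed character. In that sense the reordering is not "reversed"; the composition simply skips over the lower-class mark.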
This is not only a theoretical issue as the same applies to some real
combinations. There was discussion only last week on the bidi list of a
form which might be encoded <U+064A, U+0652, U+0654> but which would be
messed up if composed into <U+0626, U+0652>.
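
The Arabic case can be probed the same way. This is again only a sketch assuming Python's standard unicodedata module; the variable names are illustrative.

```python
import unicodedata

# YEH, SUKUN, HAMZA ABOVE: the sequence discussed on the bidi list.
seq = "\u064A\u0652\u0654"

composed = unicodedata.normalize("NFC", seq)
roundtrip = unicodedata.normalize("NFD", composed)

# Combining classes of the two marks, to see why SUKUN does not block HAMZA ABOVE.
mark_classes = [unicodedata.combining(c) for c in seq[1:]]

print([f"U+{ord(c):04X}" for c in composed], mark_classes)
```

Since the sukun has a lower combining class than the hamza above, NFC should pull the hamza across it onto the yeh, which is the composition into <U+0626, U+0652> described above.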
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/