Re: NFC

From: Mark Davis (mark.davis@icu-project.org)
Date: Wed Feb 01 2006 - 10:11:57 CST

Next message: Tim Greenwood: "Re: NFC"

Previous message: Jon Hanna: "Re: NFC"
In reply to: Tim Greenwood: "NFC"
Next in thread: Tim Greenwood: "Re: NFC"
Reply: Tim Greenwood: "Re: NFC"

No, that's not sufficient; there are some edge cases. In ICU we
preprocess and store a number of pieces of data that are very useful in
optimizing normalization, such as:
a) characters that can't combine or reorder with anything in front of
them
b) characters that can't combine or reorder with anything behind them
c) for a character that decomposes, the first and the last ccc
(canonical combining class) of its decomposition
and so on.
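
As a rough illustration of item (c), here is a minimal Python sketch using
only the standard unicodedata module. It is not ICU's code or data layout:
the name first_last_ccc is made up for this example, and it recomputes on
every call what an implementation like ICU would precompute and store.

```python
import unicodedata

def first_last_ccc(ch):
    """Return (first ccc, last ccc) of the canonical decomposition of ch.

    Hypothetical helper for illustration only, not an ICU API.
    """
    # NFD gives the full canonical decomposition in canonical order.
    nfd = unicodedata.normalize("NFD", ch)
    return unicodedata.combining(nfd[0]), unicodedata.combining(nfd[-1])

# A plain letter, a precomposed letter, and a combining vowel sign whose
# decomposition starts and ends with non-zero combining classes.
for ch in ("a", "\u00e9", "\u0f73"):
    print(f"U+{ord(ch):04X}", first_last_ccc(ch))
```

The third character, U+0F73, is the interesting one: it has ccc=0 itself,
but both ends of its decomposition have a non-zero ccc, so it can interact
with marks on either side.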

If you run into a 'Maybe' character, you can use the above
information plus other UCD properties to find the minimal span that you
need to re-examine. (A character that is completely stable under NFC will
satisfy both (a) and (b), but you can do a somewhat better job if you
track the two properties separately.)
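
One possible edge case, sketched in Python with the standard unicodedata
module: the sequence <a, U+0334, U+0301>. This assumes (from my reading of
the UCD, not from the message above) that U+0334 COMBINING TILDE OVERLAY
has ccc=1 and quick-check value 'Yes', while U+0301 COMBINING ACUTE ACCENT
has ccc=230 and is 'Maybe'. The helper back_up_to_starter is hypothetical,
and backing up to a ccc=0 character is only a simple stand-in for the
precomputed data described above, not a complete rule.

```python
import unicodedata

# 'a' (Yes), U+0334 (assumed Yes, ccc=1), U+0301 (Maybe, ccc=230)
text = "a\u0334\u0301"
maybe_at = 2  # index of the 'Maybe' character

def back_up_to_starter(s, i):
    """Back up from index i to the nearest character with ccc == 0.

    Hypothetical helper; a simplification, not what ICU does.
    """
    while i > 0 and unicodedata.combining(s[i]) != 0:
        i -= 1
    return i

def renormalize_from(s, start):
    """Leave s[:start] untouched and NFC-normalize the rest."""
    return s[:start] + unicodedata.normalize("NFC", s[start:])

full = unicodedata.normalize("NFC", text)

# Starting at the previous 'Yes' character (U+0334) misses the composition
# of 'a' with U+0301, which is not blocked by the ccc=1 overlay.
naive_fix = renormalize_from(text, maybe_at - 1)

# Starting at the preceding starter ('a') gives the same result as
# normalizing the whole string.
safe_start = back_up_to_starter(text, maybe_at)
safe_fix = renormalize_from(text, safe_start)

print(naive_fix == full, safe_fix == full)
```

Here the analysis has to begin at the 'a', one character before the
previous 'Yes', because the acute accent composes with the base letter
across the overlay.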

Mark

Tim Greenwood wrote:

>Annex 8 of UAX #15 (Normalization Forms) describes the quick lookup
>property of Yes/No/Maybe for determining if a string is NFC. When I
>get a 'Maybe' is it sufficient to do the fuller analysis from the
>previous 'Yes' character? In other words (I think) is the previous
>'yes' character a stable NFC code point? From the annex it seems to be
>not, but I cannot think of an example.
>
>Can anyone provide an example where I would get a stream of 'Yes'
>followed by a 'Maybe' where the fuller analysis needs to start before
>the previous 'Yes'?
>
>Thanks
>Tim


This archive was generated by hypermail 2.1.5 : Wed Feb 01 2006 - 10:17:48 CST