Re: NFC

From: Markus Scherer (markus.icu@gmail.com)
Date: Wed Feb 01 2006 - 11:09:51 CST

Next message: Werner LEMBERG: "Re: Musical symbols"

Previous message: Tim Greenwood: "Re: NFC"
In reply to: Tim Greenwood: "Re: NFC"

I think it depends on what Tim needs to do.

If he needs to look at a single character and see if it's "inert"
under one of the normalization forms, then an analysis like Mark
suggests is best. (ICU implements this as well.)

For checking whether a *string* is in fact normalized according to some
form, the check can be simpler. In ICU, when I hit a qc_maybe value (which
can only happen for NFC or NFKC, not NF*D), I take the smallest
surrounding segment between starters, normalize that segment, and see
if it's the same as the original. A starter in this sense has (qc_yes
&& ccc==0), or it decomposes and the first resulting character
fulfills this condition.
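
A rough Python sketch of the string check described above, for NFC only.
This is not ICU's code. Python's unicodedata module does not expose the
quick-check property, so quick_check() below is a stand-in that derives
"no" and "maybe" from decomposition data; that derivation, and the names
quick_check, is_starter and is_nfc, are assumptions made for illustration.

```python
import sys
import unicodedata

# Assumed stand-in for NFC_Quick_Check=Maybe: Hangul vowel/trailing jamo, plus
# every second character of a canonical pair that survives NFC (not excluded).
_MAYBE = {chr(cp) for cp in range(0x1161, 0x1176)}
_MAYBE |= {chr(cp) for cp in range(0x11A8, 0x11C3)}
for _cp in range(sys.maxunicode + 1):
    _ch = chr(_cp)
    _parts = unicodedata.decomposition(_ch).split()
    if (len(_parts) == 2 and not _parts[0].startswith("<")
            and unicodedata.normalize("NFC", _ch) == _ch):
        _MAYBE.add(chr(int(_parts[1], 16)))


def quick_check(ch):
    """Approximate NFC quick-check value of one character (hypothetical helper)."""
    if unicodedata.normalize("NFC", ch) != ch:
        return "no"      # the character cannot occur in NFC text
    return "maybe" if ch in _MAYBE else "yes"


def is_starter(ch):
    """Starter in the sense above: qc_yes && ccc==0, or the character
    decomposes and the first resulting character fulfills that condition."""
    def plain(c):
        return unicodedata.combining(c) == 0 and quick_check(c) == "yes"
    if plain(ch):
        return True
    first = unicodedata.normalize("NFD", ch)[0]
    return first != ch and plain(first)


def is_nfc(s):
    """Return True if s is in NFC, normalizing only segments around a 'maybe'."""
    i, n = 0, len(s)
    last_ccc = 0
    while i < n:
        ch = s[i]
        ccc = unicodedata.combining(ch)
        if ccc != 0 and last_ccc > ccc:
            return False                      # combining marks out of canonical order
        qc = quick_check(ch)
        if qc == "no":
            return False
        if qc == "maybe":
            # Smallest surrounding segment between starters.
            start = i
            while start > 0 and not is_starter(s[start]):
                start -= 1
            end = i + 1
            while end < n and not is_starter(s[end]):
                end += 1
            segment = s[start:end]
            if unicodedata.normalize("NFC", segment) != segment:
                return False
            i, last_ccc = end, 0              # resume at the next starter
            continue
        last_ccc = ccc
        i += 1
    return True
```

With this sketch, is_nfc("e\u0301") is False because the segment composes to
U+00E9, while is_nfc("q\u0301") is True because U+0301 is a "maybe" that has
nothing to combine with after "q".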

markus

--
Opinions expressed here may not reflect my company's positions unless
otherwise noted.


This archive was generated by hypermail 2.1.5 : Wed Feb 01 2006 - 11:16:01 CST