From: Markus Scherer (markus.icu@gmail.com)
Date: Wed Feb 01 2006 - 11:09:51 CST
I think it depends on what Tim needs to do.
If he needs to look at a single character and see if it's "inert"
under one of the normalization forms, then an analysis like Mark
suggests is best. (ICU implements this as well.)
For checking if a *string* is in fact normalized according to some
form, it can be simpler. In ICU, when I hit a qc_maybe value (which
can only happen for NFC or NFKC, not NF*D), I take the smallest
surrounding segment between starters, normalize that segment, and see
if it's the same as the original. A starter in this sense either has
(qc_yes && ccc==0), or it decomposes and the first character of its
decomposition fulfills this condition.
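
A rough sketch of the string check described above, for NFC only. This is not
ICU's code: it uses Python's standard unicodedata module, and it approximates
the quick-check values from decomposition data instead of reading the real
NFC_Quick_Check property. The names NFC_MAYBE, quick_check, is_starter and
is_nfc are made up for this illustration.

```python
import unicodedata


def _build_nfc_maybe():
    """Approximate the set of qc_maybe characters for NFC (an assumption:
    derived from stdlib decomposition data, not from ICU or the UCD file)."""
    maybe = set()
    for cp in range(0x110000):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if not decomp or decomp.startswith("<"):
            continue  # no decomposition, or a compatibility one
        parts = decomp.split()
        # Second character of a composite that NFC actually keeps composed.
        if len(parts) == 2 and unicodedata.normalize("NFC", ch) == ch:
            maybe.add(chr(int(parts[1], 16)))
    # Hangul vowel and trailing-consonant jamo compose algorithmically.
    maybe.update(chr(c) for c in range(0x1161, 0x1176))
    maybe.update(chr(c) for c in range(0x11A8, 0x11C3))
    return maybe


NFC_MAYBE = _build_nfc_maybe()


def quick_check(ch):
    """Return 'no', 'maybe' or 'yes' for a single character (approximation)."""
    if unicodedata.normalize("NFC", ch) != ch:
        return "no"
    return "maybe" if ch in NFC_MAYBE else "yes"


def is_starter(ch):
    """Starter in the sense above: (qc_yes && ccc==0), or the character
    decomposes and the first resulting character fulfills that condition."""
    if unicodedata.combining(ch) == 0 and quick_check(ch) == "yes":
        return True
    nfd = unicodedata.normalize("NFD", ch)
    if nfd != ch:
        first = nfd[0]
        return unicodedata.combining(first) == 0 and quick_check(first) == "yes"
    return False


def is_nfc(text):
    """Check whether text is in NFC, normalizing only around qc_maybe hits."""
    last_ccc = 0
    i = 0
    while i < len(text):
        ch = text[i]
        qc = quick_check(ch)
        ccc = unicodedata.combining(ch)
        # qc_no, or combining marks out of canonical order: not normalized.
        if qc == "no" or (ccc != 0 and ccc < last_ccc):
            return False
        if qc == "maybe":
            # Smallest surrounding segment between starters.
            start = i
            while start > 0 and not is_starter(text[start]):
                start -= 1
            end = i + 1
            while end < len(text) and not is_starter(text[end]):
                end += 1
            segment = text[start:end]
            if unicodedata.normalize("NFC", segment) != segment:
                return False
            i = end
            last_ccc = 0
            continue
        last_ccc = ccc
        i += 1
    return True
```

With this sketch, is_nfc("caf\u00e9") is True and is_nfc("cafe\u0301") is
False, while is_nfc("q\u0301") is True because that segment has no composed
form. Building NFC_MAYBE walks every code point once at import time; real
code would use precomputed property data instead.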
markus
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.