Re: NFC

From: Markus Scherer (markus.icu@gmail.com)
Date: Wed Feb 01 2006 - 11:09:51 CST

    I think it depends on what Tim needs to do.

    If he needs to look at a single character and see if it's "inert"
    under one of the normalization forms, then an analysis like Mark
    suggests is best. (ICU implements this as well.)

    For checking whether a *string* is in fact normalized according to
    some form, it can be simpler. In ICU, when I hit a qc_maybe value
    (which can only happen for NFC or NFKC, not for NFD or NFKD), I take
    the smallest surrounding segment between starters, normalize that
    segment, and see if it's the same as the original. A starter in this
    sense is a character with (qc_yes && ccc==0), or one that decomposes
    such that the first resulting character fulfills this condition.
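
    A rough Python sketch of that string check for NFC, for illustration
    only. This is not ICU's code. The names is_nfc, is_boundary and
    NFC_MAYBE are made up for the example. Python's unicodedata module
    does not expose the quick-check property, so the "maybe" set is
    approximated here from the canonical decomposition data, and the
    starter test uses only the first half of the definition above
    (qc_yes && ccc==0), which is conservative and just makes the
    re-normalized segments a little longer.

```python
import sys
import unicodedata


def _build_nfc_maybe():
    """Approximate the set of code points with NFC quick-check 'maybe'.

    Assumption: a character is 'maybe' if it is the second half of a
    two-character canonical decomposition that NFC composes again.
    Hangul vowel and trailing jamo are added by hand because their
    composition is algorithmic and not listed in the decomposition data.
    """
    maybe = set(range(0x1161, 0x1176)) | set(range(0x11A8, 0x11C3))
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if not decomp or decomp.startswith("<"):
            continue  # no decomposition, or a compatibility one
        parts = decomp.split()
        if len(parts) != 2:
            continue  # singleton decomposition
        if unicodedata.normalize("NFC", ch) != ch:
            continue  # excluded from composition, never produced by NFC
        maybe.add(int(parts[1], 16))
    return frozenset(maybe)


NFC_MAYBE = _build_nfc_maybe()


def is_boundary(ch):
    """Simplified starter test: ccc == 0 and quick-check 'yes'."""
    return (unicodedata.combining(ch) == 0
            and ord(ch) not in NFC_MAYBE
            and unicodedata.normalize("NFC", ch) == ch)


def is_nfc(s):
    """Return True if s is in NFC, normalizing only around 'maybe' hits."""
    i = 0
    prev_ccc = 0
    while i < len(s):
        ch = s[i]
        ccc = unicodedata.combining(ch)
        if ccc != 0 and ccc < prev_ccc:
            return False  # combining marks out of canonical order
        if unicodedata.normalize("NFC", ch) != ch:
            return False  # quick-check 'no'
        if ord(ch) in NFC_MAYBE:
            # Widen to the nearest boundaries on both sides.
            start = i
            while start > 0 and not is_boundary(s[start]):
                start -= 1
            end = i + 1
            while end < len(s) and not is_boundary(s[end]):
                end += 1
            segment = s[start:end]
            if unicodedata.normalize("NFC", segment) != segment:
                return False
            i = end  # the rest of the segment is already verified
            prev_ccc = 0
            continue
        prev_ccc = ccc
        i += 1
    return True
```

    For example, is_nfc("cafe\u0301") only normalizes the two-character
    segment "e\u0301" rather than the whole string.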

    markus

    --
    Opinions expressed here may not reflect my company's positions unless
    otherwise noted.
    

