RE: Normalization in panlingual application

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Sep 21 2007 - 13:57:58 CDT

    Philippe Verdy said:

    > I know that. But I did not discuss about NFC/NFD. Only about NFKC/NFKD that
    > was not designed for interoperable interchange purpose.

    I've let a number of things in this thread pass... but this
    claim is simply flat-out false.

    *All* normalization forms defined by UAX #15 are designed for
    interoperable interchange. If they weren't, the UTC wouldn't
    have bothered to spend the time specifying them exactly in
    the UAX in the first place, nor defending the stability guarantees
    for all forms of Unicode normalization.

    > The fact that IDN makes some use of it (now in a non-conforming way because
    > it uses its own rules to define its own sets of mappings, and to preserve
    > compatibility with future evolutions, it does not automatically integrate
    > all Unicode additions) is another problem,

    And this claim about IDN is also false.

    IDN depends on NamePrep (RFC 3491), which specifies the use of
    Unicode normalization form KC, as described in StringPrep (RFC 3454).
    And StringPrep clearly requires conformant use: "If a profile
    is going to use a Unicode normalization, it MUST use Unicode
    normalization form KC."

    The fact that the output of NamePrep is not the same as simply
    normalizing a string with NFKC is beside the point. Of course
    it is different, because NamePrep specifies the use of various
    mappings and character prohibitions in addition to NFKC
    normalization. But it is misleading and false to claim that
    IDN makes use of NFKC "in a non-conforming way".
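
    To make the difference concrete, here is a minimal sketch that
    compares plain NFKC with NamePrep output. It assumes Python's
    standard library, whose encodings.idna.nameprep function implements
    RFC 3491 against Unicode 3.2 data; the compare helper and the
    sample strings are hypothetical, chosen only for illustration.

```python
import unicodedata
from encodings.idna import nameprep  # stdlib NamePrep (RFC 3491) implementation


def compare(label):
    """Return (NFKC form, NamePrep form); the second is None if prohibited.

    Hypothetical helper for illustration only.
    """
    nfkc = unicodedata.normalize("NFKC", label)
    try:
        prepped = nameprep(label)
    except UnicodeError:
        # NamePrep rejects the label outright (prohibited character).
        prepped = None
    return nfkc, prepped


samples = (
    "\uFF21bc",   # FULLWIDTH LATIN CAPITAL LETTER A + "bc": also case-folded
    "a\u00ADb",   # SOFT HYPHEN: mapped to nothing by NamePrep, kept by NFKC
    "a\uE000b",   # private-use character: prohibited by NamePrep
)
for sample in samples:
    print(ascii(sample), "->", compare(sample))
```

    In each case the NFKC step itself is applied as UAX #15 specifies;
    the differences come from the extra mapping and prohibition steps
    that the profile layers around it.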

    > but anyway IDN is not a Unicode
    > specification and its correct implementation is not mandatory for Unicode
    > conformance.

    That much is correct.

    --Ken


