RE: Open Issue #61: Proposed Update UAX #15 Unicode Normalization Forms

From: Shawn Steele (
Date: Wed Jan 26 2005 - 18:38:37 CST

    "Simon" said:

    > There is deployed code and standards that use the old interpretation.
    There is deployed code that uses both interpretations.

    > StringPrep, and IDN, will continue to use the old interpretation,
    > until they are updated to reference this update. There are no draft
    > documents on that, as far as I know.
    As far as I know (I could be wrong), StringPrep & IDN don't specify
    which interpretation of the UAX is "correct" for those RFCs. Besides,
    these are not linguistically correct code points, so names shouldn't
    really contain them. Additionally, IDN requires that
    ToAscii(ToUnicode(x)) == x, which pretty much forces NFKC(x) == x
    (ToAscii does the NFKC step, and x should already be NFKC). So any name
    that would be broken by this clarification would be illegal anyway in
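
    The round-trip requirement can be illustrated with a rough sketch.
    It assumes Python's built-in "idna" codec (an IDNA 2003
    implementation) as a stand-in for ToAscii/ToUnicode; the helper
    names and the sample label are hypothetical, chosen only for
    illustration.

```python
import unicodedata


def round_trips(ace_label: bytes) -> bool:
    """Check ToAscii(ToUnicode(x)) == x for one ACE label.

    Assumption: the stdlib "idna" codec plays the role of ToUnicode
    (decode) and ToAscii (encode).
    """
    unicode_label = ace_label.decode("idna")        # ToUnicode step
    return unicode_label.encode("idna") == ace_label  # ToAscii step


def is_nfkc_stable(text: str) -> bool:
    """Return True if NFKC(text) == text."""
    return unicodedata.normalize("NFKC", text) == text


# Hypothetical sample label, not taken from the thread.
label = b"xn--bcher-kva"
decoded = label.decode("idna")
print(round_trips(label), is_nfkc_stable(decoded))
```

    A label that survives the round trip decodes to a string that is
    already NFKC-stable, which is the point being made above.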

    > I'd wish that this was only about punishing people that came to the
    > "wrong conclusion". I believe the previous situation was perfectly
    > clear, even if that situation is problematic, in that the introduction
    > text and example code were buggy. It seems to me that one problematic
    > situation is solved by creating other problems.

    It's obvious that the text disagreed with itself and with the sample
    code. Where the bug lies seems somewhat subjective; however, the
    property NFKC(NFKC(x)) == NFKC(x) is obviously desirable and was
    explicitly stated in the text. It is unfortunate that this test case
    wasn't included in the test file
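
    The idempotence property can be checked mechanically. The following
    is a minimal sketch, assuming Python's unicodedata module as the
    normalizer; the sample strings are illustrative choices (a
    compatibility ligature, a base letter plus combining mark, and two
    starter / mark / starter sequences), not cases quoted from the
    thread or from the official test file.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply NFKC using the Unicode data bundled with Python."""
    return unicodedata.normalize("NFKC", text)


def is_nfkc_idempotent(text: str) -> bool:
    """Return True if NFKC(NFKC(text)) == NFKC(text)."""
    once = nfkc(text)
    return nfkc(once) == once


# Illustrative inputs only (assumptions, not from the thread).
SAMPLES = [
    "\ufb01",              # LATIN SMALL LIGATURE FI
    "A\u030a",             # A + COMBINING RING ABOVE
    "\u0b47\u0300\u0b3e",  # starter, combining mark, second starter
    "\u1100\u0300\u1161",  # Hangul jamo with an intervening mark
]

for sample in SAMPLES:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in sample)
    print(code_points, is_nfkc_idempotent(sample))
```

    A conforming implementation should report True for every input; a
    False here would indicate a normalizer that breaks the property.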

    Anyway, this has been well discussed already, and either way would
    require some people to fix their code, so I wouldn't try to argue
    against the update :-)

    - Shawn

    Shawn Steele
    SDE, Microsoft
