Normalization (was: Re: Hebrew combining classes)

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Jan 17 2005 - 00:05:07 CST


    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    >> Isn't that what Peter said? If you don't care about standard
    >> normalization forms, you don't care about canonical equivalence.
    >
    > This was a different point. It just spoke about alternate
    > normalization forms, but in my opinion, a transformation based on
    > "alternative combining classes" should not be named "normalization" if
    > it does not preserve canonical equivalence.
    >
    > My opinion is just weakened by the fact that Unicode also speaks about
    > "normalization" when referring to NFKC and NFKD forms, even though they
    > don't preserve canonical equivalence.

    For better or worse, "normalization" is a fairly generic term, and its
    meaning is not restricted by TUS except to say that there are these four
    "normalization forms." So it's not wrong to use the term for converting
    text to some other form, even if doing so is non-standard and bad for
    interoperability.
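
    To make the four standard forms concrete, here is a minimal sketch that
    assumes Python and its standard unicodedata module. The sample string and
    the helper name normalize_all are only illustrative; they are not taken
    from TUS or from this thread.

```python
import unicodedata

# The four normalization forms defined by the Unicode Standard.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalize_all(text):
    """Return a dict mapping each form name to the normalized text."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

# Illustrative sample: precomposed e-acute (U+00E9) followed by the
# "fi" ligature (U+FB01), which has only a compatibility decomposition.
sample = "\u00e9\ufb01"
results = normalize_all(sample)

for form, value in results.items():
    # Print the code points so the differences are visible.
    print(form, " ".join(f"U+{ord(ch):04X}" for ch in value))
```

    In this sketch, only the two K forms replace the ligature with plain
    "f" and "i"; NFC and NFD leave it alone.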

    > NFC and NFD forms are not extremely useful, including for collation,
    > or even for rendering. They only suit the need for compatibility with
    > non-Unicode standards that can't compose/decompose characters
    > themselves.

    NFC may not seem to fit the Unicode model, but far from being "not
    useful," it is required by certain Internet and W3C protocols.

    NFD is really the True Unicode Way, relegated to "normalization form"
    status by the practical reality of precomposed characters.
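
    As a rough illustration of the precomposed/decomposed distinction, again
    assuming Python's standard unicodedata module (the helper name
    canonically_equivalent is hypothetical), the two spellings of e-acute
    differ as code point sequences but compare equal once both are
    normalized to the same form:

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def canonically_equivalent(a, b):
    """Compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# A raw code point comparison treats the two spellings as different.
raw_equal = precomposed == decomposed

# After normalization they are the same text.
equivalent = canonically_equivalent(precomposed, decomposed)

# NFC recomposes the decomposed sequence into the precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(raw_equal, equivalent, recomposed == precomposed)
```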

    -Doug Ewell
     Fullerton, California
     http://users.adelphia.net/~dewell/


