Re: Does Unicode 4.1 change NFC?

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Apr 04 2005 - 12:02:42 CST


    > >They're new characters, Philippe. They weren't encoded until 4.1.
    > >

    Peter Kirk continued:

    > In that case these character allocations seem perverse, given that both
    > of these characters could have been assigned to the BMP, or both to
    > outside it

    Perverse it may be, but there is no point in casting implied
    asperversions at the UTC.

    It was perverse of the DPRK standards body to add them to
    PKS C-5700 in the first place.

    It was perverse of the DPRK to insist that they be encoded
    in 10646 in the BMP.

    It was perverse of WG2 to assign them to the BMP.

    But it was not perverse of the UTC to acquiesce in that assignment,
    to guarantee continued synchronization between the standards.

    > It could also be a serious
    > security hole, as hackers try sending U+FACF to various implementations
    > in an attempt to crash them.

    Crying "security hole!" seems to be the Fad Of The Month on the
    Unicode list, but this isn't a security hole.

    In any conformant Unicode 4.0.1 (or earlier) version of normalization,
    U+FACF normalizes to (tada!) U+FACF. If it doesn't, the normalizer
    isn't conformant. If sending U+FACF to such a normalizer crashes
    an application, then shame on the programmer.

    In any conformant Unicode 4.1.0 version of normalization, U+FACF
    normalizes to U+2284A. If it doesn't, the normalizer isn't
    conformant. If sending U+FACF to such a normalizer crashes
    an application, then shame on the programmer.
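
    As a rough illustration of the two cases, here is a minimal Python
    sketch. It assumes the standard unicodedata module, whose default
    tables are newer than 4.1.0 and whose ucd_3_2_0 object is a frozen
    Unicode 3.2.0 snapshot. That snapshot stands in for "4.0.1 or earlier"
    only in the sense that U+FACF is unassigned there too; the helper
    name fmt is made up for the example.

```python
import unicodedata

SOURCE = "\uFACF"        # CJK COMPATIBILITY IDEOGRAPH-FACF, new in 4.1
TARGET = "\U0002284A"    # its canonical decomposition from 4.1 onward


def fmt(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Normalizer built on post-4.1 data: U+FACF maps to U+2284A.
current = unicodedata.normalize("NFC", SOURCE)

# Normalizer built on pre-4.1 data (3.2.0 snapshot, an assumption standing
# in for 4.0.1): U+FACF is unassigned, so it must pass through unchanged.
old = unicodedata.ucd_3_2_0.normalize("NFC", SOURCE)

print(unicodedata.unidata_version, fmt(SOURCE), "->", fmt(current))
print(unicodedata.ucd_3_2_0.unidata_version, fmt(SOURCE), "->", fmt(old))
```

    Neither call crashes; the two normalizers simply give the answer that
    is correct for their own version of the data.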

    There is a very good set of normalization test data available for
    both Unicode 4.0.0 and now for Unicode 4.1.0. Anyone who puts
    out an implementation of normalization that cannot pass the
    appropriate version test deserves what they get.
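
    Running that test is not much work. The sketch below assumes the
    five-column layout of NormalizationTest.txt (source; NFC; NFD; NFKC;
    NFKD, as hex code points) and the invariants listed in that file's
    header. The two SAMPLE lines are illustrative lines in that layout,
    not a copy of the file, and check_line is a hypothetical helper name.

```python
import unicodedata

# Illustrative lines in the NormalizationTest.txt layout (not the real file).
SAMPLE = """\
@Part1 # illustrative part header
1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE
FACF;2284A;2284A;2284A;2284A; # CJK COMPATIBILITY IDEOGRAPH-FACF
"""


def parse_field(field):
    """Turn '0044 0307' into the corresponding two-character string."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def check_line(line, normalize=unicodedata.normalize):
    """Return True/False for a data line, None for comments and headers."""
    data = line.split("#", 1)[0].strip()
    if not data or data.startswith("@"):
        return None
    c1, c2, c3, c4, c5 = (parse_field(f) for f in data.split(";")[:5])
    return (
        all(normalize("NFC", c) == c2 for c in (c1, c2, c3))
        and all(normalize("NFC", c) == c4 for c in (c4, c5))
        and all(normalize("NFD", c) == c3 for c in (c1, c2, c3))
        and all(normalize("NFD", c) == c5 for c in (c4, c5))
        and all(normalize("NFKC", c) == c4 for c in (c1, c2, c3, c4, c5))
        and all(normalize("NFKD", c) == c5 for c in (c1, c2, c3, c4, c5))
    )


def run(normalize):
    """Check every data line of SAMPLE with the given normalizer."""
    results = (check_line(ln, normalize) for ln in SAMPLE.splitlines())
    return [r for r in results if r is not None]


print("current data:", run(unicodedata.normalize))
print("3.2.0 data:  ", run(unicodedata.ucd_3_2_0.normalize))
```

    The point of passing the normalizer in as a parameter is that the
    test data and the normalizer have to be of the same version: a
    pre-4.1 normalizer is expected to fail the U+FACF line of post-4.1
    test data, and that is a version mismatch, not a bug.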

    In neither case is this a security hole *caused* by the allocation.

    --Ken


