Re: U+2047 double question mark collation

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jan 15 2003 - 15:53:53 EST

    Vadim,

    > I have a problem with creating a collation key for U+2047 (double
    > question mark).
    >
    > Explicit collation keys for this symbol are absent from allkeys.txt.

    allkeys.txt in the current version of the Unicode Collation Algorithm
    is based on the Unicode *3.1* repertoire. This can be seen in
    the references section in UTS #10, where the version is explicitly
    listed as allkeys-3.1.1.txt.

    U+2047 is a character added to Unicode Version *3.2*.

    >
    > In UnicodeData.txt this symbol has a compatibility decomposition mapping.
    >
    > 2047: ... :<compat> 003F 003F: ...

    True.
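
    The mapping can be checked with a minimal sketch like the one below.
    It assumes Python's standard unicodedata module, whose bundled data
    is newer than Unicode 3.2 and so already contains U+2047; the helper
    name compat_decomposition is hypothetical.

```python
import unicodedata


def compat_decomposition(cp):
    """Return (tag, code points) for the decomposition mapping of cp.

    Hypothetical helper: tag is e.g. "compat" for a compatibility
    mapping, or None for a canonical mapping or no mapping at all.
    """
    raw = unicodedata.decomposition(chr(cp))
    if not raw:
        return None, []
    fields = raw.split()
    tag = None
    if fields[0].startswith("<"):
        # A leading <...> field names the compatibility formatting tag.
        tag = fields[0].strip("<>")
        fields = fields[1:]
    return tag, [int(f, 16) for f in fields]


tag, parts = compat_decomposition(0x2047)
print(tag, ["%04X" % p for p in parts])
```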

    >
    > Based on this, and as defined in UTR #10, Unicode Collation Algorithm,
    > this symbol must have these collation keys:
    >
    > 003F [*024E.0020.0004]
    > 003F [*024E.0020.0004]
    >
    > But CollationTest_NON_IGNORABLE.txt assumes that this symbol has the
    > implicit collation key [FBC0.0020.0002] [A047.0000.0000].

    CollationTest_NON_IGNORABLE.txt is also based on the Unicode 3.1
    repertoire. For a Unicode 3.1 implementation of collation,
    U+2047 is a reserved code point.
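
    The implicit key quoted above follows from treating U+2047 as an
    unassigned code point. A minimal sketch, assuming the implicit-weight
    formula as I understand it from UTS #10 (first primary = base +
    (CP >> 15), second primary = (CP & 0x7FFF) | 0x8000, with base FBC0
    for code points that are not ideographs); the function names are
    hypothetical:

```python
def implicit_collation_elements(cp, base=0xFBC0):
    """Derive two implicit collation elements for a code point.

    Assumption: base 0xFBC0 applies to unassigned and other
    non-ideographic code points; the secondary and tertiary weights
    are the fixed values seen in the quoted test-file key.
    """
    aaaa = base + (cp >> 15)
    bbbb = (cp & 0x7FFF) | 0x8000
    return [(aaaa, 0x0020, 0x0002), (bbbb, 0x0000, 0x0000)]


def format_elements(elements):
    """Render elements in the [PPPP.SSSS.TTTT] notation used above."""
    return " ".join("[%04X.%04X.%04X]" % e for e in elements)


print(format_elements(implicit_collation_elements(0x2047)))
# prints: [FBC0.0020.0002] [A047.0000.0000]
```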

    This situation, where the allkeys.txt table is slightly out of sync
    with (behind) the ongoing repertoire additions to the Unicode Standard,
    is a known problem we are working on.

    The Unicode Technical Committee has mandated that the repertoire
    for the allkeys.txt table be updated directly to the Unicode 4.0
    repertoire, as soon after the release of Unicode 4.0 as
    possible. We are trying to do this more or less simultaneously
    this time, but there may be a small delay, given the scope
    of the upcoming Unicode 4.0 release.

    In the meantime, if you need to deal with character additions
    for Unicode 3.2 for collation, then you need to handle them
    in terms of tailorings from the current allkeys.txt table.
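
    One way to picture such a tailoring is as an override table that is
    consulted before the default table. This is only a sketch under
    stated assumptions: the weights for U+2047 are the ones proposed in
    the quoted message, the dictionary and function names are
    hypothetical, the default table is left empty rather than loaded
    from allkeys.txt, and the variable-weighting flag (the "*") is
    omitted for brevity.

```python
# Hypothetical stand-in for entries parsed from allkeys.txt (3.1.1).
DEFAULT_TABLE = {}

# Tailoring: give U+2047 the two elements proposed in the quoted
# message, i.e. the weights of U+003F with a compatibility tertiary.
TAILORING = {
    0x2047: [(0x024E, 0x0020, 0x0004), (0x024E, 0x0020, 0x0004)],
}


def implicit_elements(cp, base=0xFBC0):
    """Fallback for code points found in neither table (assumed formula)."""
    return [
        (base + (cp >> 15), 0x0020, 0x0002),
        ((cp & 0x7FFF) | 0x8000, 0x0000, 0x0000),
    ]


def collation_elements(cp, tailoring=TAILORING, table=DEFAULT_TABLE):
    """Look up cp: tailoring first, then default table, then implicit."""
    if cp in tailoring:
        return tailoring[cp]
    if cp in table:
        return table[cp]
    return implicit_elements(cp)


print(collation_elements(0x2047))                # tailored elements
print(collation_elements(0x2047, tailoring={}))  # implicit fallback
```

    With the tailoring in place, U+2047 receives the two tailored
    elements; without it, the lookup falls through to the implicit
    weights, which is the behavior the 3.1 test file expects.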

    --Ken


