Error in default collation element computation (?)

From: Bernard Desgraupes (bdesgraupes@easyconnect.fr)
Date: Thu Jan 27 2005 - 08:01:01 CST

    Forgive me if this has already been reported or if I'm just
    misunderstanding, but I think there is a mistake in the description of
    the algorithm to compute a default collation element for characters
    with compatibility decompositions.

    This is in paragraph 7.3 (Compatibility Decompositions). Under point
    3 it says the following:
    =============
    3. Set the first two L3 values to be lookup(L3), where the lookup uses
    the table in §7.3.1 Tertiary Weight Table. Set the remaining L3 values
    to MAX (which in the default table is 001F):
    0028 [*023D.0020.0004] % LEFT PARENTHESIS
    0032 [.06C8.0020.001F] % DIGIT TWO
    0029 [*023E.0020.001F] % RIGHT PARENTHESIS
    ==============

    In that case, the level 3 weight for character 0032 should be 0004
    instead of 001F, so we should have:
    0032 [.06C8.0020.0004] % DIGIT TWO

    This is corroborated by the already computed value found in allkeys.txt:

    2475 ;
    [*0288.0020.0004.2475][.0E2B.0020.0004.2475][*0289.0020.001F.2475] #
    PARENTHESIZED DIGIT TWO; QQKN
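
    To make the point concrete, here is a small Python sketch (my own
    illustration, not code from the UCA text) of step 3 as it is worded:
    the first two collation elements take the tertiary weight looked up
    in the Tertiary Weight Table (0004 for <compat>, which I assume here),
    and only the remaining elements take MAX (001F in the default table).

```python
# Sketch (my reading, not official code) of UCA section 7.3, step 3:
# for a character with a compatibility decomposition, the first two
# collation elements get the looked-up tertiary weight, and only the
# remaining elements get MAX (0x001F in the default table).
MAX_TERTIARY = 0x001F

def apply_tertiary_weights(elements, looked_up_l3):
    """elements: list of [L1, L2, L3] triples; returns new triples."""
    return [
        [l1, l2, looked_up_l3 if i < 2 else MAX_TERTIARY]
        for i, (l1, l2, _l3) in enumerate(elements)
    ]

# The "(2)" example from 7.3, assuming lookup(<compat>) = 0x0004:
elements = [
    [0x023D, 0x0020, 0],  # LEFT PARENTHESIS
    [0x06C8, 0x0020, 0],  # DIGIT TWO
    [0x023E, 0x0020, 0],  # RIGHT PARENTHESIS
]
weighted = apply_tertiary_weights(elements, 0x0004)
# DIGIT TWO is the second element, so it gets 0x0004; only the
# RIGHT PARENTHESIS (third element) gets MAX (0x001F).
```

    On this reading, the text of step 3 itself agrees with allkeys.txt
    and disagrees with the printed example.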

    Since I did not see any correction in the beta release of 4.1.0, I
    thought I'd mention it (I know UTR #10 is just a technical report,
    not an annex, but anyway).

    Cheers

    Bernard

    This archive was generated by hypermail 2.1.5 : Thu Jan 27 2005 - 08:01:42 CST