UCA tertiary weight assignment vs. decomposition type definition in Unicode character database

From: Matt Ma <matt.ma.umail_at_gmail.com>
Date: Fri, 27 Jan 2012 13:16:28 -0800

Hi,

A few characters have no decomposition type defined in
UnicodeData.txt, yet allkeys.txt assigns them tertiary weights
as if they had one. Here are a few examples (version 6.0.0):

(1) in UnicodeData.txt

    31B4;BOPOMOFO FINAL LETTER P;Lo;0;L;;;;;N;;;;;
    A732;LATIN CAPITAL LETTER AA;Lu;0;L;;;;;N;;;;A733;
    A733;LATIN SMALL LETTER AA;Ll;0;L;;;;;N;;;A732;;A732
    1F1E6;REGIONAL INDICATOR SYMBOL LETTER A;So;0;L;;;;;N;;;;;

(2) in allkeys.txt

    A733 ; [.15A3.0020.0004.A733][.15A3.0020.0004.A733] # LATIN SMALL LETTER AA; QQKN
    A732 ; [.15A3.0020.000A.A732][.15A3.0020.000A.A732] # LATIN CAPITAL LETTER AA; QQKN
   1F1E6 ; [.15A3.0020.000A.1F1E6] # REGIONAL INDICATOR SYMBOL LETTER A; QQK
    31B4 ; [.31C9.0020.0019.31B4] # BOPOMOFO FINAL LETTER P; QQK

U+A733, U+A732, and U+1F1E6 were given tertiary weights as if they
were <compat>, while U+31B4 was given one as if it were <final>.
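
For anyone who wants to reproduce the comparison, here is a minimal
Python sketch. It assumes the sample lines quoted above are
representative of the 6.0.0 files, and the helper names
(decomposition_tags, tertiary_weights) are made up for illustration.
It only extracts the two fields being compared; it does not try to
explain how the weights were derived.

```python
import re

# Sample lines copied from this message (Unicode 6.0.0 data).
UNICODE_DATA = """\
31B4;BOPOMOFO FINAL LETTER P;Lo;0;L;;;;;N;;;;;
A732;LATIN CAPITAL LETTER AA;Lu;0;L;;;;;N;;;;A733;
A733;LATIN SMALL LETTER AA;Ll;0;L;;;;;N;;;A732;;A732
1F1E6;REGIONAL INDICATOR SYMBOL LETTER A;So;0;L;;;;;N;;;;;
"""

ALLKEYS = """\
A733 ; [.15A3.0020.0004.A733][.15A3.0020.0004.A733] # LATIN SMALL LETTER AA; QQKN
A732 ; [.15A3.0020.000A.A732][.15A3.0020.000A.A732] # LATIN CAPITAL LETTER AA; QQKN
1F1E6 ; [.15A3.0020.000A.1F1E6] # REGIONAL INDICATOR SYMBOL LETTER A; QQK
31B4 ; [.31C9.0020.0019.31B4] # BOPOMOFO FINAL LETTER P; QQK
"""

# One collation element in the 4-weight form shown above:
# [.primary.secondary.tertiary.fourth] (a leading '*' marks a variable element).
ELEMENT = re.compile(r"\[[.*]([0-9A-F]+)\.([0-9A-F]+)\.([0-9A-F]+)\.[0-9A-F]+\]")


def decomposition_tags(text):
    """Map code point -> compatibility tag from field 5 of UnicodeData.txt.

    The value is '' when the field is empty or holds an untagged
    (canonical) decomposition.
    """
    tags = {}
    for line in text.splitlines():
        fields = line.split(";")
        match = re.match(r"<(\w+)>", fields[5])
        tags[fields[0]] = match.group(1) if match else ""
    return tags


def tertiary_weights(text):
    """Map code point -> list of tertiary weights, one per collation element."""
    weights = {}
    for line in text.splitlines():
        entry = line.split("#", 1)[0]          # drop the trailing comment
        code, elements = entry.split(";", 1)   # code point ; collation elements
        weights[code.strip()] = [t for _, _, t in ELEMENT.findall(elements)]
    return weights


if __name__ == "__main__":
    tags = decomposition_tags(UNICODE_DATA)
    for code, tertiary in tertiary_weights(ALLKEYS).items():
        print(code, "tag:", repr(tags[code]), "tertiary:", tertiary)
```

To run this against the real files, replace the two sample strings
with the contents of UnicodeData.txt and allkeys.txt, and skip the
header and blank lines in allkeys.txt.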

Is this something documented outside of UCA?

Thanks,
Matt
Received on Fri Jan 27 2012 - 15:26:36 CST
