L2/06-309R
Source: Kent Karlsson
Date: November 7, 2006

Please update my previous report (L2/06-309, "Bug in DerivedNumericValues.txt"),
on the numeric value for F9B2, with the following more extensive report.

======================================================================

The following two characters are not given numeric values in
DerivedNumericValues.txt:

6F06; 漆; 7      CJK UNIFIED IDEOGRAPH-6F06   jp-obsolete-financial
9621; 阡; 1000   CJK UNIFIED IDEOGRAPH-9621   jp-obsolete-financial

(both according to http://en.wikipedia.org/wiki/Japanese_numerals#Formal_numbers)

Wikipedia gives numeric values to these two CJK characters; I have no
independent confirmation, though. Wikipedia also gives very large numeric
values (over one trillion) to certain CJK characters. At least some of these
may be worth giving numeric values in DerivedNumericValues.txt (which is
derived from Unihan.txt, so any omission is there), especially since some of
them are also used as translations of SI prefixes.

The following eight characters are each canonically equivalent to a CJK
character that is given a numeric value in DerivedNumericValues.txt, but
these canonical equivalents are given no numeric value in either
UnicodeData.txt or DerivedNumericValues.txt. I think canonically equivalent
strings should carry the same numeric values, not just informally but also
formally in the Unicode database, so the following eight characters should be
given appropriate numeric values in UnicodeData.txt and consequently in
DerivedNumericValues.txt.
(Each UnicodeData.txt record below is followed by the proposed numeric value.)

F96B;CJK COMPATIBILITY IDEOGRAPH-F96B;Lo;0;L;53C3;;;;N;;;;;     3
F973;CJK COMPATIBILITY IDEOGRAPH-F973;Lo;0;L;62FE;;;;N;;;;;     10
F978;CJK COMPATIBILITY IDEOGRAPH-F978;Lo;0;L;5169;;;;N;;;;;     2
F9B2;CJK COMPATIBILITY IDEOGRAPH-F9B2;Lo;0;L;96F6;;;;N;;;;;     0
F9D1;CJK COMPATIBILITY IDEOGRAPH-F9D1;Lo;0;L;516D;;;;N;;;;;     6
F9D3;CJK COMPATIBILITY IDEOGRAPH-F9D3;Lo;0;L;9678;;;;N;;;;;     6
F9FD;CJK COMPATIBILITY IDEOGRAPH-F9FD;Lo;0;L;4EC0;;;;N;;;;;     10
2F890;CJK COMPATIBILITY IDEOGRAPH-2F890;Lo;0;L;5EFE;;;;N;;;;;   9

The CJK telegraph symbols should also be given numeric values in
UnicodeData.txt as well as in DerivedNumericValues.txt, as, e.g., the
parenthesized digits/numbers are:

* for months (32C0-32CB, which have compatibility decompositions beginning
  with digits and ending with 6708): values 1-12
* for days (33E0-33FE, which have compatibility decompositions beginning
  with digits and ending with 65E5): values 1-31
* for hours (3358-3370, which have compatibility decompositions beginning
  with digits and ending with 70B9): values 0-24

======================================================================
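
Illustration (not part of the proposal itself): the two derivations suggested
above can be sketched in Python using only the standard unicodedata module.
The function names, the UNIT_SUFFIXES set and the REPORTED_BASE_VALUES table
are hypothetical helpers introduced for this sketch; the table simply repeats
the values listed in this report rather than reading them from
DerivedNumericValues.txt. The sketch assumes that a canonical singleton
decomposition should inherit the value of its target, and that a telegraph
symbol's value is the decimal number spelled by the leading digits of its
compatibility decomposition.

```python
import unicodedata

# Final ideograph of each telegraph-symbol decomposition, as listed in the
# report: 6708 (month), 65E5 (day), 70B9 (hour).
UNIT_SUFFIXES = {0x6708, 0x65E5, 0x70B9}

# Values this report lists for the eight decomposition targets (assumption:
# copied from the report, not read from DerivedNumericValues.txt).
REPORTED_BASE_VALUES = {
    0x53C3: 3, 0x62FE: 10, 0x5169: 2, 0x96F6: 0,
    0x516D: 6, 0x9678: 6, 0x4EC0: 10, 0x5EFE: 9,
}


def decomposition_parts(cp):
    """Return (tag, [code points]) for the decomposition mapping of cp.

    tag is None for a canonical mapping (or no mapping at all) and a string
    such as "<compat>" for a compatibility mapping.
    """
    fields = unicodedata.decomposition(chr(cp)).split()
    tag = None
    if fields and fields[0].startswith("<"):
        tag = fields.pop(0)
    return tag, [int(field, 16) for field in fields]


def inherited_value(cp, base_values):
    """Value of the canonical singleton target of cp, or None."""
    tag, parts = decomposition_parts(cp)
    if tag is None and len(parts) == 1:
        return base_values.get(parts[0])
    return None


def telegraph_value(cp):
    """Number spelled by the leading digits of a telegraph symbol, or None."""
    tag, parts = decomposition_parts(cp)
    if tag != "<compat>" or len(parts) < 2 or parts[-1] not in UNIT_SUFFIXES:
        return None
    digits = "".join(chr(p) for p in parts[:-1])
    if not (digits.isascii() and digits.isdigit()):
        return None
    return int(digits)


if __name__ == "__main__":
    for cp in (0xF96B, 0xF9B2, 0x2F890):
        print(f"{cp:04X}", inherited_value(cp, REPORTED_BASE_VALUES))
    for first, last in ((0x32C0, 0x32CB), (0x33E0, 0x33FE), (0x3358, 0x3370)):
        values = [telegraph_value(cp) for cp in range(first, last + 1)]
        print(f"{first:04X}-{last:04X}", values[0], values[-1])
```

Passing the base values in as a table keeps the sketch independent of which
numeric values a particular Python build's Unicode database already contains;
only the decomposition mappings are read from unicodedata.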