From: Kent Karlsson (kent.karlsson14@telia.com)
Date: Tue Nov 30 2010 - 07:36:02 CST
On 2010-11-29 23:24, "Kenneth Whistler" <kenw@sybase.com> wrote:
...
> they are quite often used in traditional numbering in
> East Asia, which does not use decimal radix forms. Handling
> Han numeric ideographs requires special processing to
> parse numeric values correctly.
CLDR and ICU have (some) support for that. See
http://www.unicode.org/cldr/trac/browser/trunk/common/rbnf/zh_Hant.xml
http://www.unicode.org/cldr/trac/browser/trunk/common/rbnf/zh.xml
http://www.unicode.org/cldr/trac/browser/trunk/common/rbnf/ja.xml
The data in these files is used by the RBNF (rule-based number format)
formatting and parsing APIs in ICU:
http://icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html
(None of these rule sets permits substituting "numerically equivalent" Han
characters when parsing.)
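
To illustrate why the traditional (non-positional) forms need special
processing, here is a minimal sketch of a parser for simple Han numerals.
It is not ICU's RBNF implementation and does not read the CLDR rule files;
the function name parse_han_number and the character tables are assumptions
made for this example, and only a small, common subset of characters is
handled.

```python
# Illustrative only: a tiny parser for traditional Han numerals.
# This is NOT how ICU's RuleBasedNumberFormat works internally.

DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10**4, "萬": 10**4, "亿": 10**8, "億": 10**8}


def parse_han_number(text: str) -> int:
    """Parse a simple multiplicative Han numeral such as 三百二十一."""
    if not text:
        raise ValueError("empty numeral")
    total = 0    # sum of sections already closed by 万/億
    section = 0  # value accumulated below the next big unit
    digit = 0    # most recent digit, waiting for its multiplier
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare 十 means 10, so a missing digit counts as 1.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            # Close the current section and scale it by the big unit.
            # Assumes big units appear in descending order.
            total += (section + digit) * BIG_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit
```

For example, parse_han_number("三百二十一") yields 321 and
parse_han_number("一万二千") yields 12000. A digit-by-digit string such as
二〇一〇 is not handled correctly by this approach, which is one reason
positional and spellout systems are treated separately.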
More on numbering systems in CLDR: see
http://www.unicode.org/cldr/trac/browser/trunk/common/supplemental/numberingSystems.xml
Only one decimal positional system using Han characters is supported (for
now at least); it is called "hanidec". The names listed in
numberingSystems.xml can be used in the ICU API to request the numbering
system in question. (Some of the number spellout systems, including the
Han character ones, can be requested that way; most cannot, and one must
then use the RBNF API directly.)
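
As a rough illustration of what a decimal positional system like "hanidec"
amounts to, the sketch below maps each ASCII digit to a Han digit and back.
It assumes the digit string 〇一二三四五六七八九 for hanidec and does not
call ICU; the names HANIDEC_DIGITS, to_hanidec and from_hanidec are
hypothetical helpers for this example only.

```python
# Illustrative only: positional digit substitution, no ICU involved.
# Assumed hanidec digits for 0..9:
HANIDEC_DIGITS = "〇一二三四五六七八九"


def to_hanidec(n: int) -> str:
    """Write a non-negative integer with Han decimal digits."""
    if n < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    return "".join(HANIDEC_DIGITS[int(d)] for d in str(n))


def from_hanidec(s: str) -> int:
    """Read a string of Han decimal digits as a base-10 integer."""
    if not s:
        raise ValueError("empty string")
    value = 0
    for ch in s:
        idx = HANIDEC_DIGITS.find(ch)
        if idx < 0:
            raise ValueError(f"not a hanidec digit: {ch!r}")
        value = value * 10 + idx
    return value
```

With these helpers, to_hanidec(2010) gives 二〇一〇 and
from_hanidec("二〇一〇") gives 2010. No multiplier characters such as 十 or
百 are involved, which is what distinguishes this from the spellout forms.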
/Kent K