From: Mark Davis ☕ (mark@macchiato.com)
Date: Mon Nov 29 2010 - 16:14:23 CST
The numeric types, including Han, are supplied in:
http://unicode.org/Public/6.0.0/ucd/extracted/DerivedNumericType.txt
http://unicode.org/Public/6.0.0/ucd/extracted/DerivedNumericValues.txt
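
As a quick local check, here is a minimal sketch using Python's standard unicodedata module, which exposes the same decimal / digit / numeric split that these files record. The Unicode version it reflects depends on the Python build, so the derived files above remain the authoritative source. The helper name numeric_profile and the choice of sample characters are made up for this illustration.

```python
import unicodedata


def numeric_profile(ch):
    """Return (general category, decimal, digit, numeric) for one character.

    Properties the character does not have come back as None
    instead of raising ValueError.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# Sample characters (an arbitrary selection for illustration):
# ASCII digit five, superscript two, vulgar fraction one half,
# the Han digit five (U+4E94), ideographic number zero (U+3007),
# and the first code point of the U+3021..U+3029 run.
samples = ["5", "\u00b2", "\u00bd", "\u4e94", "\u3007", "\u3021"]

for ch in samples:
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X} {name}: {numeric_profile(ch)}")
```

On a recent CPython, the Han digit U+4E94 should report general category Lo with only a numeric value (no decimal or digit value), which is why it cannot be parsed the way an Nd digit can.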
Mark
*— Il meglio è l’inimico del bene —*
On Mon, Nov 29, 2010 at 13:17, M.-A. Lemburg <mal@egenix.com> wrote:
> Hello,
>
> in Python we have come across a possible inconsistency with respect
> to the way code points are classified as having numeric properties in
> the Unihan database.
>
> I'd like to get information on whether this is intentional or
> just a side-effect of the Unihan database using a different
> approach to number type classification than the UCD.
>
> In the UCD, the number type is defined as:
>
> http://www.unicode.org/reports/tr44/#Numeric_Type
>
> that is, there are decimals (= decimal digits), which can be used to parse
> decimal-radix numbers; digits, which represent decimal digits but require
> special handling (e.g. superscript digits); and numeric types, which can
> mean anything from single digits to fractions and multi-digit numbers.
>
> In Unihan, the number code points are defined using:
>
> http://www.unicode.org/reports/tr44/#Numeric_Type_Han
>
> that is, all code points with numeric representations are grouped
> in the numeric type category, and there is an additional separation
> into accounting, primary, and other numeric use.
>
> The typically used Chinese and Japanese code points for
> numeric digits fall into the Unihan range:
>
> http://www.wordiq.com/definition/Chinese_numerals
> http://en.wikipedia.org/wiki/Chinese_numerals
>
> Question: Why don't these code points have the "Nd" category?
>
> See this list for the 5.2.0 group of Nd code points:
>
> http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt
>
> Related to this, it is also unclear what to use as official zero
> for these number systems (U+3007 is often recommended).
>
> Finally, unlike many of the other digit code point sequences
> in the UCD, there doesn't appear to be such a sequence for
> Chinese decimal digits (apart from the incomplete vertical variant
> U+3021 - U+3029, which lacks the zero).
>
> Thanks,
> --
> Marc-Andre Lemburg
> eGenix.com