From: Doug Ewell (email@example.com)
Date: Thu Nov 27 2003 - 13:46:18 EST
Arcane Jill wrote (in rich text):
> The review on Ethiopic and Tamil non-decimal digits is interesting,
> but I can't help but feel it was a culturally biased decision (read:
> mistake) to EVER have had a "radix ten" property without any similar
> property for any other radix, thereby forcing non-decimal digits to
> end up being classified as No (Other_Number) instead of Nd
I think the charge of cultural bias is overstated. The *vast* majority
of cultures on Earth use a base-10 positional system.
On the contrary, having a property that associates all the various DIGIT
NINEs, from Latin to Arabic-Indic to Oriya to Lao to Ethiopic to Limbu,
with the numeric value 9 shows a distinct absence of bias toward a
As Mark said, the minority of cultures that use a number system other
than base-10 positional can still have their numbers represented in
Unicode, but software that wishes to interpret numeric values using such
a system must handle it specially. Not every number system can be
neatly encapsulated in UnicodeData.txt.
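
To make the point about the DIGIT NINEs concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name describe() is mine, purely for illustration, and the values come from whatever version of the Unicode Character Database ships with the Python build, so treat the output as indicative rather than normative.

```python
import unicodedata

# Character names for the scripts mentioned above; lookup() resolves each
# name to the actual character.
NAMES = [
    "DIGIT NINE",
    "ARABIC-INDIC DIGIT NINE",
    "ORIYA DIGIT NINE",
    "LAO DIGIT NINE",
    "ETHIOPIC DIGIT NINE",
    "LIMBU DIGIT NINE",
]


def describe(name):
    """Return (general category, numeric value, decimal-digit value or None)."""
    ch = unicodedata.lookup(name)
    return (
        unicodedata.category(ch),
        unicodedata.numeric(ch),
        # decimal() only succeeds for base-10 positional digits; the second
        # argument is the default returned when the property is absent.
        unicodedata.decimal(ch, None),
    )


for name in NAMES:
    print(name, describe(name))
```

With the data in recent Python releases, every character in the list should report the numeric value 9.0, while ETHIOPIC DIGIT NINE reports category No and no decimal-digit value. That is the "handle it specially" case: the value is available, but software cannot treat the character as a positional decimal digit.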
> It's a mistake because, even in my culture, digit one followed by
> digit two is not always interpreted as the number twelve. Phone
> numbers and PINs are one exception. Version numbers such as "version
> 12.12.12" are another exception.
Those aren't numbers. Ha ha! Surprised? They are *character strings*
that happen to consist (mostly) of digits.
There is *no inherent numeric value* to a phone number, a PIN, a U.S.
ZIP code, or a credit card number. They are just identifiers. You
cannot get any meaningful result by performing arithmetic operations on
them, unless of course you care who was assigned a PIN immediately
before you (but even then, you could do the same thing with alphabetic
identifiers). Phone number assignment is anything but sequential.
In fact, interpreting such an identifier strictly as a number can lead
to problems if leading zeros are dropped. The Boston ZIP code "02101"
cannot be correctly rendered as "2101" even though that is numerically
equivalent.
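
The leading-zero problem is easy to reproduce. The following is only a sketch in Python; the function name is_valid_zip() is hypothetical, and the check it performs (exactly five ASCII digits) is a simplifying assumption, not a full ZIP code validator.

```python
zip_code = "02101"

# Treating the identifier as a number silently discards the leading zero.
as_number = int(zip_code)
round_tripped = str(as_number)


def is_valid_zip(text):
    """Assumed rule for illustration: exactly five ASCII digits."""
    return len(text) == 5 and text.isascii() and text.isdigit()


print(as_number)                    # 2101
print(round_tripped)                # 2101
print(is_valid_zip(zip_code))       # True
print(is_valid_zip(round_tripped))  # False
```

Keeping the value as a character string from input to output avoids the problem entirely.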
In a software version number such as "12.12.12", the individual twelves
are decimal numbers, but the full stop between them is not a normal
decimal point. Each component "12" is a number, but the complete
eight-character string is not.
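
As a rough illustration of the same point in Python: the components can be parsed as decimal integers, but the string as a whole cannot be parsed as a number. The helper parse_version() is a hypothetical name, and it assumes a version string made only of digit groups separated by full stops.

```python
def parse_version(text):
    """Split on full stops and read each component as a decimal integer."""
    return tuple(int(part) for part in text.split("."))


print(parse_version("12.12.12"))  # (12, 12, 12)

# The complete string is not a decimal number.
try:
    float("12.12.12")
except ValueError:
    print("not a number")

# Even two-component versions do not order like decimal fractions.
print(parse_version("1.10") > parse_version("1.9"))  # True
print(float("1.10") > float("1.9"))                  # False
```

Comparing the tuples component by component gives the ordering people expect from version numbers, which is different from the ordering of the decimal fractions 1.10 and 1.9.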
In my culture at least, we even misuse this term "number" to refer to
ALPHANUMERIC identifiers. For example, we speak of the "serial number"
on a dollar bill, a "driver's license number," or a "license plate
number." All of these typically contain one or more letters.
People always look at me like I'm crazy when I say these aren't numbers,
but they aren't.