# Non-decimal positional digits; was: Defined Private Use

From: Ernest Cline (ernestcline@mindspring.com)
Date: Wed Apr 28 2004 - 10:32:14 EDT


> [Original Message]
>
> Ernest Cline <ernestcline at mindspring dot com> wrote:
>
> > TENGWAR DUODECIMAL DIGITS TEN and ELEVEN
> > present an interesting problem. They are digits, but not
> > decimal digits. Should the concept of General Category
> > Nd be expanded to include non-decimal number systems?
>
> No, the "d" stands for Decimal. This category is deliberately limited
> to characters that can be concatenated to form numbers in a base-10
> positional number system. It's a fact of life that base-12 and base-16
> digits are relegated to category No.

I recognized that limitation with my choice of words. The fact is
that at present Unicode does not encode any non-decimal digits.
(The hex digits [a-f][A-F] don't count, because they aren't used
exclusively as digits and have no numeric values assigned to them,
only the not-so-helpful Hex_Digit and ASCII_Hex_Digit properties.)
Given that fact, it might make sense to provide a way for Unicode
to indicate non-decimal positional digits. Extending Nd was one
possibility I considered; another was to use the triple number
(all three numeric fields of UnicodeData.txt filled in).
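
To make the point about the hex digits concrete, here is a minimal
sketch using Python's standard unicodedata module. The helper name
numeric_triple is my own, and the results depend on the Unicode
version bundled with the interpreter.

```python
import unicodedata

def numeric_triple(ch):
    """Return (decimal, digit, numeric) for ch; None marks an empty field."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

# A decimal digit, two hex letters, a superscript digit, a Roman numeral.
for ch in ("7", "A", "f", "\u00b2", "\u2167"):
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {numeric_triple(ch)}")
```

Only "7" (Nd) should come back with all three fields set; the letters
A and f should have none at all, which is the gap described above.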

> > Or would
> > E06A;TENGWAR DIGIT TEN;Nl;0;L;;10;10;10;N;;;;;
> > be sufficient?
>
> I think the General Category has to be No rather than Nl. Very few
> characters are of type Nl -- just the Roman numerals, "Hangzhou"
> numerals and Ideographic Zero, and Runic and Gothic letter-numbers.
> Tengwar duodecimal digits aren't letters that got pressed into service
> as numbers, they're just digits that happen to be base-12.

Given that the zero and one were letters (at least in the Tengwar draft
I looked at; there are several at present), and that the precedent set
by Runic and Gothic is that when some numbers are letters and the
rest are not, the extra numbers get the Nl category instead of No,
I would think that Nl would be appropriate for Tengwar. If Tengwar
were altered to disunify zero and one from the same-shaped letters,
then No would indeed be appropriate.

> Also, of the three "10" values, you need to remove the first -- it's
> only valid for characters with the decimal digit property (see
> http://www.unicode.org/Public/UNIDATA/UCD.html for more details).

As I said above, Unicode does not currently have any digits used
with a non-decimal positional system. Extending the use of either
category Nd or the triple number in UnicodeData.txt seemed
an appropriate way to do so. Looking over the data files again,
I see that there exist Nd's without the triple number that are not
intended to be used in positional systems, so I think that if Unicode
chooses to provide a mechanism to specify non-decimal positional
digits, using the triple number is probably the best approach.
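
For reference, a rough sketch of how the proposed record splits into
the UnicodeData.txt fields under discussion. The function names are
hypothetical, and the check encodes only the convention cited in the
quoted reply (decimal field reserved for Nd); it is not an official
validator.

```python
def parse_numeric_fields(record):
    """Split a UnicodeData.txt-style record; pull out fields 2 and 6-8."""
    parts = record.split(";")
    # Fields 6, 7, 8 are decimal digit value, digit value, numeric value.
    decimal, digit, numeric = (p or None for p in parts[6:9])
    return {
        "code": parts[0],
        "name": parts[1],
        "gc": parts[2],
        "decimal": decimal,
        "digit": digit,
        "numeric": numeric,
    }

def follows_decimal_convention(rec):
    """True if the decimal field is empty or the category is Nd."""
    return rec["decimal"] is None or rec["gc"] == "Nd"

proposed = parse_numeric_fields("E06A;TENGWAR DIGIT TEN;Nl;0;L;;10;10;10;N;;;;;")
print(proposed["gc"], proposed["decimal"], proposed["digit"], proposed["numeric"])
print(follows_decimal_convention(proposed))
```

Under the current convention the check fails for this record, which is
exactly the extension being floated: a filled-in triple on a non-Nd
character as the marker for a non-decimal positional digit.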
