Re: ? Reasonable to propose stability policy on numeric type = decimal

From: vanisaac@boil.afraid.org
Date: Tue Jul 27 2010 - 16:03:26 CDT


    From: Kenneth Whistler (kenw@sybase.com)
    --------------------------------------------------------------------------------
    > C. E. Whitehead said:
    >
    > > I've not gone through many character charts though so I can't
    > > really speak as an expert as you all can; sorry I've not gotten
    > > to more; I will try to ...
    >
    > For people who wish to pursue this issue further, the relevant
    > information is neatly summarized in the extracted property
    > data file:
    >
    > http://www.unicode.org/Public/UNIDATA/extracted/DerivedNumericType.txt
    >
    > That is what you should look at for efficiency, and
    > is basically what the UTC would be using for discussion
    > about this matter.
    >
    > --Ken

    C.E.

    Specifically, notice the New Tai Lue numbers (U+19D0-U+19DA). We have a sequence of eleven gc=Nd characters that absolutely cannot be arranged so that consecutive code points have ascending numeric values. I also doubt that, if Arabic were encoded today, there would be a full set of Eastern digits; only 4-7 would be encoded, with 0-3, 8, and 9 shared with the regular Arabic digits. This leads me to the conclusion that any formal policy invites definitionally insoluble problems in future encodings: a collision between encoding each character only once and having a mathematically pure digit sequence.

    That having been said, I have absolutely no problem with reserving a code point for zero, especially when a script is still in current use by a modern language community. Even if a script has not used place-value notation before, adopting it is a simple adaptation once its user community is exposed to global business, scientific, and standards communities.

    Even though I have no official say, as a script encoder, my vote would be to simply recommend that decimal digits be sequentially ordered 0-9, and to leave a reserved code point if the system is in modern use but does not currently use place-value notation and hence does not have a digit zero. I would explicitly fight against anything more formal, as it would unnecessarily encumber script encoders, who have to balance many more interests than those of programmers who won't provide an exception branch for non-sequential number arrangements. You've gotta do it anyway, for CJK and New Tai Lue. I would also question any programmer who wouldn't allow for mixing of the two blocks of Arabic digits. Just leave the code open for future additions, just as you do for the sequential/ascending numbers.
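
    To make the programmer's side concrete, here is a minimal sketch in Python of what I mean by leaving the code open. It assumes the standard unicodedata module, whose results depend on the Unicode version bundled with the interpreter. The names digit_value and parse_decimal are hypothetical helpers invented for illustration. The point is that each digit's value comes from a property lookup, not from subtracting the code point of a block's zero, so non-sequential arrangements and mixed Arabic digit blocks need no special branch.

```python
import unicodedata


def digit_value(ch):
    """Return the decimal value of ch from the character database, or None.

    Hypothetical helper: a property lookup, so it does not assume that
    digits sit at consecutive, ascending code points.
    """
    return unicodedata.decimal(ch, None)


def parse_decimal(text):
    """Parse a run of decimal digits, from one script or several, into an int."""
    if not text:
        raise ValueError("empty digit string")
    total = 0
    for ch in text:
        value = digit_value(ch)
        if value is None:
            raise ValueError(f"not a decimal digit: U+{ord(ch):04X}")
        total = total * 10 + value
    return total


# Mixes the two Arabic digit blocks: U+0661, U+0662 (Arabic-Indic)
# followed by U+06F3 (Extended Arabic-Indic).
mixed = parse_decimal("\u0661\u0662\u06f3")
```

    Whether a given character counts as a decimal digit here is whatever the bundled character data says, so future additions are picked up without touching the code.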

    -Van Anderson


