Re: Digit/letter variants in the "same" unified script (was: stability policy on numeric type = decimal)

From: Asmus Freytag (
Date: Thu Jul 29 2010 - 17:45:07 CDT

    Having Nd be limited to characters that

    a) are used in decimal radix numbers
    b) are part of a complete, ordered sequence 0..9

    would make this property regular enough to serve
    implementers. You could script the creation of
    relevant data for your implementation based on that.
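
    As a rough illustration of such a script, here is a minimal
    Python sketch. It assumes the standard unicodedata module (whose
    data reflects whatever Unicode version your Python ships with),
    and the function names nd_runs, is_regular and zero_digits are
    made up for this example.

```python
import sys
import unicodedata


def nd_runs():
    """Split all Nd characters into candidate digit series.

    A new run starts whenever the code points stop being contiguous
    or the decimal value drops back to 0.
    """
    runs, current = [], []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) != "Nd":
            continue
        contiguous = bool(current) and ord(current[-1]) == cp - 1
        if current and (not contiguous or unicodedata.decimal(ch, -1) == 0):
            runs.append(current)
            current = []
        current.append(ch)
    if current:
        runs.append(current)
    return runs


def is_regular(run):
    """True if the run is a complete, ordered sequence 0..9."""
    return [unicodedata.decimal(ch, -1) for ch in run] == list(range(10))


def zero_digits():
    """Code points of the zero digit of every regular series."""
    return [ord(run[0]) for run in nd_runs() if is_regular(run)]
```

    Any run for which is_regular returns False would be a candidate
    for the documented exceptions; the list from zero_digits is the
    kind of table an implementation could generate and ship.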

    *Exceptions* exist and need to be documented.
    Having exceptions machine readable is not as
    important, but having implementers understand
    them is.

    Therefore, the best thing is for these to become
    something other than Nd, but to retain their numeric
    type of digit.
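
    To make the distinction concrete, here is a small Python sketch
    that approximates the numeric type from the three numeric fields
    exposed by the standard unicodedata module. The helper name
    numeric_type is hypothetical, and the result depends on the
    Unicode version bundled with your Python.

```python
import unicodedata


def numeric_type(ch):
    """Approximate Numeric_Type from the fields unicodedata exposes."""
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"   # has a decimal digit value
    if unicodedata.digit(ch, None) is not None:
        return "Digit"     # digit value only, no decimal value
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"   # numeric value only
    return "None"
```

    SUPERSCRIPT TWO (U+00B2) is an existing character of the kind
    described: its general category is No, yet numeric_type reports
    "Digit" for it.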

    A detailed explanation of each in the appropriate
    script chapter, together with a complete summary of
    all exceptional cases in a central place (section 4.6
    comes to mind), would provide implementers with the
    information they need.

    The exceptional cases that I'm aware of are

    a) Arabic using two complete series of digits
    b) New Tai Lue using an extra digit 1
    c) Han digits being scattered and used in two
    different types of numeric expressions
    d) ASCII digits being used for some scripts
    as preferred decimal-radix digits, because
    their native number system is not, or not
    exclusively decimal-radix
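
    Case (a) is the easiest one to handle mechanically. The sketch
    below, again in Python with the standard unicodedata module,
    maps any character that has a decimal value to its ASCII digit;
    the helper name to_ascii_digits is invented for this example.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every character with a decimal value by its ASCII digit."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        # Characters without a decimal value pass through unchanged.
        out.append(ch if value is None else str(value))
    return "".join(out)
```

    Both Arabic series collapse to the same result: the strings
    U+0663 U+0664 and U+06F3 U+06F4 each come out as "34". A Han
    digit such as U+4E09 has no decimal value and is left alone,
    which is one reason case (c) needs prose documentation rather
    than a table lookup.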

    The above information belongs in section 4.6
    in summary form, or simply as a table of pointers
    to each script chapter that contains a description
    of unusual numeric behavior for decimal-radix numbers.

    (A separate table pulling together all the descriptions
    of non-decimal radix number systems that are
    discussed in the Standard would equally be useful
    for the readers).
