Re: ? Reasonable to propose stability policy on numeric type = decimal

From: Asmus Freytag
Date: Mon Jul 26 2010 - 13:06:11 CDT


    On 7/26/2010 6:55 AM, John Burger wrote:
    > Mark Davis ☕ wrote:
    >> From just a quick scan, it appears that they are currently all
    >> contiguous within their respective groups. If we were to impose a
    >> stability policy, it would be a constraint on the general_category:
    >> we would not assign general_category=decimal_number to any character
    >> unless it was part of a contiguous range of 10 such characters with
    >> ascending values from 0..9.
    While that is true for the properties, it's not true for the encoding of
    characters that are *used* as decimal digits. Martin gave the most widely
    used counterexample.
    > Whether such a policy makes sense, I'm not clear on why it would be
    > called a "stability" policy - the analogy to the existing such
    > policies seems strained at best.
    There are two parts to this.

    One, and I think this is the more important part, is to have an encoding
    policy of not splitting up runs of decimal digits - which would include
    reserving a spot for a zero, in case, *over the lifetime of Unicode*,
    some script changes its use from numbers 1-9 to decimal digits.

    The other is a guarantee of what it means for a character to have the
    decimal digit property.

    My suggestions for handling these two parts differ a bit from what has
    been discussed so far.

    The first I would address by suitable language in the WG2 Principles and
    Procedures document. This is where policies on encoding are maintained.
    True, these policies do allow exceptions, and exceptions (note Han!) do
    exist; if a similar case of mixed-use characters came along, it
    would have to be dealt with accordingly. What the P&P would do is
    remove the wrong notion that it is OK to scatter runs of known decimal
    digits when encoding new scripts.

    The second I would address not by a stability policy, but by clarity of
    definition of the property. Language such as:

        "A character is given the decimal digit property if and only if it is
         used in a decimal place-value notation and all 10 digits are encoded
         in a single unbroken run starting with the digit of value 0, in
         order of magnitude".

    or equivalent would be quite sufficient. That language happens to be a
    much clearer statement of the *implicit* definition used in assigning
    this property than the language found in UAX #44 or in Section 4.6 of
    the Unicode Standard.

    Having that language where the property is documented is much more
    useful and visible than in a stability policy.
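
    To illustrate the structural half of the suggested wording, here is a
    minimal Python sketch. It assumes that general_category=Nd, as reported
    by Python's standard unicodedata module, stands in for "has the decimal
    digit property". The helper names nd_runs and run_is_well_formed are
    made up for this example. The sketch checks only contiguity and
    ascending order; it cannot check whether a character is actually *used*
    in a decimal place-value notation, and it says nothing about digits that
    lack the property.

```python
import sys
import unicodedata


def nd_runs():
    """Collect maximal runs of consecutive code points with General_Category=Nd."""
    runs = []
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) != "Nd":
            continue
        if runs and runs[-1][-1] == cp - 1:
            # Extends the previous run by one code point.
            runs[-1].append(cp)
        else:
            # Starts a new run.
            runs.append([cp])
    return runs


def run_is_well_formed(run):
    """True if the run splits into groups of 10 whose values ascend 0..9.

    Adjacent digit sets merge into one longer run, so the length only
    has to be a multiple of 10.
    """
    if len(run) % 10 != 0:
        return False
    return all(
        unicodedata.decimal(chr(cp), None) == i % 10
        for i, cp in enumerate(run)
    )


if __name__ == "__main__":
    bad = [r for r in nd_runs() if not run_is_well_formed(r)]
    for run in bad:
        print(f"U+{run[0]:04X}..U+{run[-1]:04X} ({len(run)} code points)")
    print(f"{len(bad)} run(s) do not fit the proposed wording")
```

    The result depends on the version of the Unicode data bundled with the
    Python build, so the sketch only reports what it finds.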

