Re: ? Reasonable to propose stability policy on numeric type = decimal

From: CE Whitehead (
Date: Tue Jul 27 2010 - 13:52:28 CDT



    > From: Mark Davis ☕ (
    > Date: Mon Jul 26 2010 - 14:13:22 CDT
    > I agree that having it stated at point of use is useful - and we do that in
    > other cases covered by stability clauses; but we can only state it IF we
    > have the corresponding stability policy.

    > Mark

    > . . .
    >> On Mon, Jul 26, 2010 at 11:06, Asmus Freytag <> wrote:

    >>> On 7/26/2010 6:55 AM, John Burger wrote:
    >>> Mark Davis ☕ wrote:
    >>>> From just a quick scan, it appears that they are currently all contiguous
    >>>> within their respective groups. If we were to impose a stability policy, it
    >>>> would be a constraint on the general_category: we would not assign
    >>>> general_category=decimal_number to any character unless it was part of a
    >>>> contiguous range of 10 such characters with ascending values from 0..9.
    >> While that is true for the properties, it's not true for the encoding of
    >> characters that are *used* as decimal digits. Martin gave the most widely
    >> used counterexample.
    >>> Whether such a policy makes sense, I'm not clear on why it would be called
    >>> a "stability" policy - the analogy to the existing such policies seems
    >>> strained at best.
    >> There are two parts to this.
    >> One, and I think this is the more important part, is to have an encoding
    >> policy of not splitting up runs of decimal digits - which would include
    >> reserving a spot for a zero, in case, *over the lifetime of Unicode*, some
    >> script changes its use from numbers 1-9 to decimal digits.
    >> The other is a guarantee of what it means for a character to have the
    >> decimal digit property.
    >> My suggestions for handling this differ a bit from what has been discussed
    >> so far.
    >> The first I would address by suitable language in the WG2 Principles and
    >> Procedures document. This is where policies on encoding are maintained.
    >> True, these policies do allow exceptions, but exceptions (note Han!) do
    >> exist, and if a similar case of mixed-use characters came along, then they
    >> would have to be dealt with accordingly. What the P&P would do is remove the
    >> wrong notion that it is OK to scatter runs of known decimal digits when
    >> encoding new scripts.
    >> The second I would address not by a stability policy, but by clarity of
    >> definition of the property. Language such as:
    >> "A character is given the decimal digit property, if and only if, it is
    >> used in a decimal place-value notation and all 10 digits are encoded
    >> in a single unbroken run starting with the digit of value 0, in
    >> ascending order of magnitude".
    >> or equivalent would be quite sufficient. That language happens to be a much
    >> clearer statement of the *implicit* definition used in assigning this
    >> property than the language found in UAX#44 or Unicode Section 4.6.
    >> Having that language where the property is documented is much more useful
    >> and visible than in a stability policy.
    >> A./
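    The quoted definition can be checked mechanically. The following is a minimal sketch, assuming Python's standard unicodedata module and whatever Unicode version that module ships with; the function names decimal_runs and run_is_well_formed are made up for illustration and come from no Unicode document. It groups characters with general_category=Nd into contiguous runs and tests whether a run consists of whole groups of ten with values ascending from 0 to 9.

```python
import unicodedata


def decimal_runs(code_points):
    """Group Nd code points into contiguous (start, end) runs.

    `code_points` must be an ascending iterable of integers.
    """
    runs = []
    for cp in code_points:
        if unicodedata.category(chr(cp)) != "Nd":
            continue
        if runs and runs[-1][1] == cp - 1:
            # Extend the current run by one code point.
            runs[-1] = (runs[-1][0], cp)
        else:
            # Start a new run.
            runs.append((cp, cp))
    return runs


def run_is_well_formed(start, end):
    """True if the run is made of whole groups of ten with values 0..9."""
    length = end - start + 1
    if length % 10 != 0:
        return False
    return all(
        # -1 is a sentinel for "no decimal value", which never equals i % 10.
        unicodedata.decimal(chr(start + i), -1) == i % 10
        for i in range(length)
    )


if __name__ == "__main__":
    for start, end in decimal_runs(range(0x110000)):
        status = "ok" if run_is_well_formed(start, end) else "BROKEN"
        print(f"U+{start:04X}..U+{end:04X} {status}")
```

    The length test allows multiples of ten because several sets of digits can sit back to back in one block, so a run may hold more than one complete 0..9 group.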
    I like this policy, both parts of it, but I agree with Asmus that the first thing to do is to define a decimal digit. That will rule out characters such as those Asmus has described, where
    "> the same [alphabetic] characters
    > are also used as elements in a system that doesn't use place-value, but
    > uses special characters to show powers of 10."
    (There is no reason for these not to be as contiguous as possible, but they cannot be contiguous if they are alphabetic. If there is no zero, then reserving a space for the zero is a moot issue. Also, these characters are all already encoded, and I think we want the policy for future encodings only.)
    There are other cases where characters do not use place value although they seem to be based on 10s, 100s, etc. A number of languages used | for 1, || for 2, ||| for 3, or something similar, and then bundled multiples of 10. Most of these seem to be ancient languages, and certainly there is no 0 and no need to reserve space for it.
    I've not gone through many character charts, though, so I can't really speak as an expert as you all can. Sorry I've not gotten to more; I will try to. (I have been looking at my registries instead; long story.)
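    To make the distinction between place-value digits and other number characters concrete, here is a small sketch, again assuming Python's standard unicodedata module; the helper name describe and the choice of sample characters are only illustrative. It reports the general category, the decimal digit value (if any), and the numeric value (if any) for an ASCII digit, a Han numeral, and two Aegean numbers, which follow the stroke-for-units, separate-sign-for-tens pattern.

```python
import unicodedata


def describe(ch):
    """Return (general category, decimal value or None, numeric value or None)."""
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.numeric(ch, None),
    )


SAMPLES = [
    "7",           # DIGIT SEVEN
    "\u4e03",      # CJK ideograph for seven (Han, mixed use)
    "\U00010107",  # AEGEAN NUMBER ONE
    "\U00010110",  # AEGEAN NUMBER TEN
]

if __name__ == "__main__":
    for ch in SAMPLES:
        name = unicodedata.name(ch, "<unnamed>")
        print(f"U+{ord(ch):04X} {name}: {describe(ch)}")
```

    Only the first sample carries a decimal digit value. The others have a numeric value but no decimal one, so a definition tied to place-value use and an unbroken 0..9 run leaves them out.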
    C. E. Whitehead


    This archive was generated by hypermail 2.1.5 : Tue Jul 27 2010 - 13:54:49 CDT