From: CE Whitehead (email@example.com)
Date: Tue Jul 27 2010 - 13:52:28 CDT
> From: Mark Davis ☕ (firstname.lastname@example.org)
> Date: Mon Jul 26 2010 - 14:13:22 CDT
> I agree that having it stated at point of use is useful - and we do that in
> other cases covered by stability clauses; but we can only state it IF we
> have the corresponding stability policy.
> . . .
>> On Mon, Jul 26, 2010 at 11:06, Asmus Freytag <email@example.com> wrote:
>>> On 7/26/2010 6:55 AM, John Burger wrote:
>>> Mark Davis ☕ wrote:
>>>> From just a quick scan, it appears that they are currently all contiguous
>>>> within their respective groups. If we were to impose a stability policy, it
>>>> would be a constraint on the general_category: we would not assign
>>>> general_category=decimal_number to any character unless it was part of a
>>>> contiguous range of 10 such characters with ascending values from 0..9.
>> While that is true for the properties, it's not true for the encoding of
>> characters that are *used* as decimal digits. Martin gave the most widely
>> used counterexample.
>>> Whether such a policy makes sense, I'm not clear on why it would be called
>>> a "stability" policy - the analogy to the existing such policies seems
>>> strained at best.
>> There are two parts to this.
>> One, and I think this is the more important part, is to have an encoding
>> policy of not splitting up runs of decimal digits - which would include
>> reserving a spot for a zero, in case, *over the lifetime of Unicode*, some
>> script changes their use from numbers 1-9 to decimal digits.
>> The other is a guarantee of what it means for a character to have the
>> decimal digit property.
>> My suggestions for handling this differ a bit from what has been discussed
>> so far.
>> The first I would address by suitable language in the WG2 Principles and
>> Procedures document. This is where policies on encoding are maintained.
>> True, these policies do allow exceptions, but exceptions (note Han!) do
>> exist, and if a similar case of mixed-use characters came along, then they
>> would have to be dealt with accordingly. What the P&P would do is remove the
>> wrong notion that it is OK to scatter runs of known decimal digits when
>> encoding new scripts.
>> The second I would address not by a stability policy, but by clarity of
>> definition of the property. Language such as:
>> "A character is given the decimal digit property, if and only if, it is
>> used in a decimal place-value notation and all 10 digits are encoded
>> in a single unbroken run starting with the digit of value 0, in
>> order of magnitude".
>> or equivalent would be quite sufficient. That language happens to be a much
>> clearer statement of the *implicit* definition used in assigning this
>> property than the language found in UAX#44 or Unicode Section 4.6.
>> Having that language where the property is documented is much more useful
>> and visible than in a stability policy.
I like this policy, both parts of it, but I agree with Asmus that the first thing to do is to define a decimal digit. That will rule out characters such as those Asmus has described, where
"> the same [alphabetic] characters
> are also used as elements in a system that doesn't use place-value, but
> uses special characters to show powers of 10. "
(There is no reason for these not to be as contiguous as possible, but they cannot be contiguous if they are alphabetic. If there is no zero, then reserving a space for the zero is a moot issue. These characters are also all already encoded, and I think we want the policy for future encodings only.)
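For anyone who wants to test Asmus's proposed wording against the data mechanically, here is a rough sketch in Python. It assumes the standard `unicodedata` module, so it checks whatever Unicode version that Python build ships, not necessarily the latest charts. The function names are my own, and the check only covers the "unbroken run of 10, starting at 0, in order" half of the definition; it says nothing about whether a character is actually *used* in place-value notation.

```python
import sys
import unicodedata


def decimal_digit_runs():
    """Collect maximal contiguous runs of code points with general_category=Nd."""
    runs = []
    current = []
    for cp in range(sys.maxunicode + 1):
        if unicodedata.category(chr(cp)) == "Nd":
            # Start a new run if this code point does not directly follow the last one.
            if current and cp != current[-1] + 1:
                runs.append(current)
                current = []
            current.append(cp)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def is_well_formed(run):
    """True if the run consists of whole sets of 10 digits with values 0..9 in order.

    Several digit sets may sit back to back (so a maximal run can be longer
    than 10), hence the modulo check rather than a strict length of 10.
    """
    if len(run) % 10 != 0:
        return False
    return all(unicodedata.decimal(chr(cp)) == i % 10 for i, cp in enumerate(run))


if __name__ == "__main__":
    for run in decimal_digit_runs():
        status = "ok" if is_well_formed(run) else "BROKEN"
        print(f"U+{run[0]:04X}..U+{run[-1]:04X} ({len(run)} chars): {status}")
```

Any run reported as BROKEN would be a counterexample to the "currently all contiguous" observation, at least for the data in that Python build.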
There are other cases where characters do not use place value, although they seem to be based on 10s, 100s, and so on.
A number of languages used | for 1, || for 2, ||| for 3,
or something similar, and then had bundled multiples of 10. Most of these seem to be ancient languages, and certainly there is no 0 and no need to reserve space for it.
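To make the distinction concrete, here is a small Python sketch of the two kinds of system. The symbols are hypothetical stand-ins of my own (`|` for a unit stroke, `T` for a bundle of ten), not characters from any particular script. In the additive system the value is just the sum of the symbols, so order carries no meaning and no zero is ever needed; in the place-value system position carries the power of 10, so an empty position has to be marked with a zero.

```python
# Hypothetical additive (tally-style) symbols: a unit stroke and a bundle of ten.
TALLY_VALUES = {"|": 1, "T": 10}


def additive_value(symbols, values=TALLY_VALUES):
    """Value of an additive numeral: the sum of its symbols, order irrelevant."""
    return sum(values[s] for s in symbols)


def place_value(digits, base=10):
    """Value of a place-value numeral given as a list of digit values."""
    total = 0
    for d in digits:
        # Each step shifts the earlier digits up by one power of the base.
        total = total * base + d
    return total


if __name__ == "__main__":
    print(additive_value("TT|||"))  # two tens and three units
    print(additive_value("|T|T|"))  # same symbols, different order
    print(place_value([2, 3]))
    print(place_value([3, 2]))      # order matters here
    print(place_value([2, 0, 3]))   # the zero holds the empty tens place
```

Only the second function has any use for a zero, which is why reserving a code point for one makes sense only for scripts that have, or might later adopt, place-value notation.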
I've not gone through many character charts, though, so I can't really speak as an expert as you all can. Sorry I've not gotten to more; I will try to (I have been looking at my registries instead; long story).
C. E. Whitehead