Re: Proposed new characters updated in Pipeline Table

From: Ken Whistler <kenw_at_sybase.com>
Date: Mon, 15 Aug 2011 12:38:10 -0700

On 8/15/2011 10:38 AM, Philippe Verdy wrote:
>> Unicode cannot encode a combining Wasla (because of various stability
>> policies), so if Syriac needs a Wasla to be shown only over a letter
>> or two, one needs to propose precomposed characters for them. Just
>> like the existing Arabic Alef-Wasla.
> Why not? If the character is new,

Occasionally, it would help if you actually did some research before heading
off on these tangents. It is easy to determine (with the use of DerivedAge.txt)
that the character Roozbeh is talking about is *not* new, but dates all the
way back to Unicode 1.1.
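
For anyone who wants to repeat that check, here is a minimal sketch of a
DerivedAge.txt lookup. It assumes the usual UCD line layout (a code point or
a "first..last" range, a semicolon, the age, then a "#" comment) and a local
copy of the file; the function names lookup_age and age_from_file are made up
for this illustration.

```python
def lookup_age(derived_age_lines, code_point):
    """Return the Age value (e.g. '1.1') for code_point, or None if not listed."""
    for raw in derived_age_lines:
        # Everything after '#' is a comment in UCD data files.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        cp_field, age = (part.strip() for part in line.split(";"))
        if ".." in cp_field:
            first, last = (int(x, 16) for x in cp_field.split(".."))
        else:
            first = last = int(cp_field, 16)
        if first <= code_point <= last:
            return age
    return None


def age_from_file(path, code_point):
    """Convenience wrapper; 'path' is assumed to point at a local DerivedAge.txt."""
    with open(path, encoding="utf-8") as handle:
        return lookup_age(handle, code_point)
```

Calling age_from_file("DerivedAge.txt", 0x0671) against a real copy of the
file should report the version in which U+0671 was first assigned.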

> it can perfectly be encoded with
> whatever character property is needed, including with a non-zero
> combining class, if it fits. The stability policy about combining
> sequences is only for sequences of characters that are already encoded
> and for which decomposition mappings (and the related standard
> normalizations), as well as basic case mappings in the UCD cannot be
> modified.

Which applies in this case. If a combining Arabic wasla were to be encoded,
it would create an alternate representation for the existing (and old)
U+0671 ARABIC LETTER ALEF WASLA. That would break normalization stability,
unless an explicit claim were made that <alef + combining wasla> is not the
same as U+0671, which in turn would defeat the whole point of having the
combining wasla encoded.
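
The situation can be inspected with Python's standard unicodedata module. The
sketch below is only an illustration: it checks that U+0671 has no
decomposition mapping, and contrasts it with U+0622 ARABIC LETTER ALEF WITH
MADDA ABOVE, which is canonically equivalent to <alef, U+0653 MADDAH ABOVE>.
The choice of U+0622 as the contrasting case is mine, not something taken
from the thread, and the results depend on the Unicode version bundled with
the Python in use.

```python
import unicodedata

ALEF = "\u0627"          # ARABIC LETTER ALEF
ALEF_WASLA = "\u0671"    # ARABIC LETTER ALEF WASLA
ALEF_MADDA = "\u0622"    # ARABIC LETTER ALEF WITH MADDA ABOVE
MADDAH_ABOVE = "\u0653"  # ARABIC MADDAH ABOVE (a combining mark)

# U+0671 is atomic: no decomposition mapping, so NFD leaves it unchanged.
wasla_is_atomic = (
    unicodedata.decomposition(ALEF_WASLA) == ""
    and unicodedata.normalize("NFD", ALEF_WASLA) == ALEF_WASLA
)

# U+0622 has a canonical decomposition, so the sequence <alef, maddah above>
# and the precomposed letter normalize to the same string.
madda_is_equivalent = (
    unicodedata.normalize("NFC", ALEF + MADDAH_ABOVE) == ALEF_MADDA
    and unicodedata.normalize("NFD", ALEF_MADDA) == ALEF + MADDAH_ABOVE
)

print("U+0671 atomic:", wasla_is_atomic)
print("U+0622 equivalent to <U+0627, U+0653>:", madda_is_equivalent)
```

A newly encoded combining wasla could not be given the U+0622 treatment,
since that would mean adding a decomposition mapping to U+0671 after the
fact.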

> The stability policy does not concern currently unassigned code points

It does, as for the case just mentioned.

> (except possibly a few ones: the directionality of all code points
> within some designated RTL blocks, should they be currently assigned
> to characters or not;

That constrains the allocation of blocks of new right-to-left scripts, but it
does not absolutely prohibit the encoding of a non-RTL character within
such a block, if the case is made. For example, a combining mark may occur
in such a script, and its Bidi_Class will end up NSM, not one of the strong
right-to-left values.
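
This is easy to see for marks that are already encoded. A short sketch, again
using Python's unicodedata module; the sample characters are my own picks
from the Arabic and Syriac blocks, and bidi_class is just a local helper
name.

```python
import unicodedata


def bidi_class(ch):
    """Return the Bidi_Class short alias (e.g. 'AL', 'R', 'NSM') for one character."""
    return unicodedata.bidirectional(ch)


samples = [
    "\u0671",  # ARABIC LETTER ALEF WASLA (a base letter)
    "\u0651",  # ARABIC SHADDA (a combining mark)
    "\u0710",  # SYRIAC LETTER ALAPH (a base letter)
    "\u0711",  # SYRIAC LETTER SUPERSCRIPT ALAPH (a combining mark)
]

for ch in samples:
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"Bidi_Class={bidi_class(ch)}, ccc={unicodedata.combining(ch)}"
    )
```

The base letters should report AL, while the combining marks in the same
blocks report NSM.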

> and the reservation of all assigned and
> unassigned code points in the few blocks allocated only for combining
> characters).

This is also not a stability policy claim. It is likely to remain the case,
because the character encoding committees don't do random things. But there
is no stability policy which would prevent a non-combining mark from ending
up in such a block.

Please do your homework before making claims like this which misinform
the list.

--Ken

Received on Mon Aug 15 2011 - 14:39:42 CDT