Re: PRI #202: Extensions to NameAliases.txt for Unicode 6.1.0

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Thu, 1 Sep 2011 08:25:44 +0200

2011/9/1 Karl Williamson <public_at_khwilliamson.com>:
> Unicode 6.0 broke UTS #18, which since 1999 has suggested that BELL be the
> name used in regular expressions for U+0007.  In 2003, this was strengthened
> to "should" be used.  The breakage occurred by requiring that BELL instead
> be the name for a different code point.  By breaking UTS #18, all
> implementations of it, including Perl's, were broken, causing real harm to
> real code and real people.  For this reason, Perl has not completely adopted
> 6.0.
>
> Further, UTS #18 encourages implementations to do exactly what Perl did:
> "The ISO names for the control characters may be unfamiliar, ... so it is
> recommended that they be supplemented with other aliases. For example, for
> U+0009 the implementation could accept the official name CHARACTER
> TABULATION, and also the aliases HORIZONTAL TABULATION, HT, and TAB."

Thanks for explaining that. So such aliases are now needed to
correct obvious errors. Well, this is not really a correction, but a
change to a new short name: the "should" that specified an alias
will now be replaced by a "must" with the new alias. This instability
should have been explained, as it was not explicit in the PRI.
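
As a rough illustration of the UTS #18 recommendation quoted above,
here is a minimal Python sketch of a name lookup that accepts several
aliases for one code point. The table ALIASES and the helpers
normalize() and lookup() are hypothetical names of my own, not part of
any real API. Only the four names for U+0009 come from the quoted
text, and the matching rule (ignore case and extra spaces) is a
simplification, not the loose matching that the standard defines.

```python
# Hypothetical alias table: every entry maps an accepted name to a code point.
# Only the four U+0009 names below are taken from the quoted UTS #18 text.
ALIASES = {
    "CHARACTER TABULATION": 0x0009,   # official name
    "HORIZONTAL TABULATION": 0x0009,  # alias
    "HT": 0x0009,                     # alias
    "TAB": 0x0009,                    # alias
}


def normalize(name: str) -> str:
    """Simplified matching (an assumption): upper-case, collapse whitespace."""
    return " ".join(name.upper().split())


def lookup(name: str) -> str:
    """Return the character for an official name or alias.

    Raises KeyError if the name is not in the table, so an unknown
    name is never silently mapped to some other code point.
    """
    return chr(ALIASES[normalize(name)])
```

With a table like this, "TAB", "ht" and "Character  Tabulation" all
resolve to U+0009, while a name absent from the table is rejected
rather than guessed.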

> The genesis of this proposal was to prevent the Unicode Consortium from
> making this kind of mistake again.  The language in UTS #18 mentioning the
> TAB variants also dates to 2003.  I think this example makes it clear why
> more than one alias may be needed per code point.
>
> Of course, PRI #202 is not the only mechanism possible to achieve the needed
> goal of preventing another mishap like BELL.  But the consensus in the
> discussion about it was that it was the easiest route to get there.

Given that most of these discussions have occurred offline (not
publicly, but in private reports to the UTC, or between UTC members
working on the closed unicore discussion list, or privately between
individuals), I had no idea of these discussions. But now that I'm a
UTC member, I hope I will hear of these cases earlier...

Does it justify so many new aliases at the same time?

I have not checked the history of all past versions of the UAXes,
UTRs, and UTNs (or even the text of the chapters of the main UTS)...
Are there other cases in those past versions that this PRI should
investigate and trace back?

Received on Thu Sep 01 2011 - 01:31:20 CDT