Re: PRI #202: Extensions to NameAliases.txt for Unicode 6.1.0

From: Asmus Freytag <>
Date: Fri, 26 Aug 2011 21:03:51 -0700

I agree with Ken that Philippe's suggestion of conflating the
annotations for mathematical use with formal Unicode name aliases is a
non-starter. The former exist to help mathematicians identify symbols in
Unicode when they know a symbol's name from entity lists. The latter are
designed to allow programmers to support identifiers that match existing
usage -- mainly for characters for which there is currently no
well-defined ID, or for characters whose abbreviated name is their
de facto name.

In a limited number of cases, that would lead to multiple aliases for
the same character. The ideal is, as always, to have a single identifier
per character, where possible. In a few exceptional cases, allowing
alternate IDs via the NameAlias technique is of such overwhelming
practical use as to justify an exception.

Aliases come from the same namespace as character names, and must be
unique, so that they can be used to unambiguously identify a character.
They are intended to be used in programmatic interfaces, for example
regular expressions. Adding redundant identifiers comes at a cost: all
implementations have to rev their name tables, and using recently added
aliases might not be portable until all implementations have caught up.
That's why proposals to add additional aliases to any *existing*
character should have to pass a really high bar. (I find the rationale
for this initial expansion well thought out and defensible - leaving
the control codes unnamed in 10646 has proven problematic to implementers).
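
To make the uniqueness requirement concrete, here is a minimal sketch in
Python of a single namespace shared by character names and formal
aliases. The NameRegistry class and its methods are hypothetical and
exist only for illustration; matching is simplified to plain upper-casing,
which is not the full loose-matching rule. The sample entries (the LINE
FEED and LF aliases for U+000A, and the GHA correction alias for U+01A2)
are real, but the table is a tiny excerpt, not the data file.

```python
class NameRegistry:
    """Toy model: names and formal aliases share one namespace (hypothetical)."""

    def __init__(self):
        # Maps an upper-cased identifier to exactly one code point.
        self._by_identifier = {}

    def register(self, identifier, code_point):
        """Add a name or alias; reject it if it already names another character."""
        key = identifier.upper()  # simplification of the real matching rules
        existing = self._by_identifier.get(key)
        if existing is not None and existing != code_point:
            raise ValueError(
                f"{key!r} already identifies U+{existing:04X}, "
                f"cannot also identify U+{code_point:04X}"
            )
        self._by_identifier[key] = code_point

    def lookup(self, identifier):
        """Resolve an identifier to its single code point (KeyError if unknown)."""
        return self._by_identifier[identifier.upper()]


registry = NameRegistry()
# U+000A has no formal character name, so aliases give it an identifier.
registry.register("LINE FEED", 0x000A)
registry.register("LF", 0x000A)
# U+01A2 has a formal name plus a correction alias.
registry.register("LATIN CAPITAL LETTER OI", 0x01A2)
registry.register("LATIN CAPITAL LETTER GHA", 0x01A2)
```

Several identifiers may point at one character, but no identifier may
point at two, which is what lets a regular expression engine or a
compiler resolve a name without asking the user to choose.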

There's no strict limit to *informative* aliases for characters, nor is
there a uniqueness requirement. If there are important real world
designations under which certain characters are known, they could be
documented with informative aliases. These informative aliases are then
available to user interface designers who wish to support a "search for
character by name" feature. Unlike the case for program source code,
such interfaces can handle multiple "hits" for the same name - by
presenting a list, for example.
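
By contrast, a "search by name" feature could be sketched roughly as
follows. The InformativeIndex class is hypothetical, and the sample
designations are made up for illustration rather than copied from the
names list; the point is only that one designation may map to several
characters and the interface returns all of them.

```python
from collections import defaultdict


class InformativeIndex:
    """Toy search index: one designation may map to many characters (hypothetical)."""

    def __init__(self):
        # Maps a case-folded designation to the set of matching code points.
        self._hits = defaultdict(set)

    def add(self, designation, code_point):
        """Record a designation; duplicates across characters are allowed."""
        self._hits[designation.casefold()].add(code_point)

    def search(self, designation):
        """Return every matching code point, sorted, or an empty list."""
        return sorted(self._hits.get(designation.casefold(), ()))


index = InformativeIndex()
# Illustrative entries only: two different characters known as "dot".
index.add("dot", 0x002E)
index.add("dot", 0x00B7)
index.add("period", 0x002E)

# A user interface would present all hits as a list for the user to pick from.
matches = [f"U+{cp:04X}" for cp in index.search("Dot")]
```

Here no uniqueness check is needed, because a person rather than a
parser resolves the ambiguity.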

Ultimately, even in this case, some annotations are better presented in
special-purpose files than as informative records in the names list. That
was done for mathematics. If there are other fields where there are
established conventions for naming symbols, perhaps someone could
provide an analogous list - but it should have no bearing on the PRI
under consideration.

Received on Fri Aug 26 2011 - 23:07:39 CDT
