Re: String name and Character Name

From: Peter Kirk
Date: Sat Apr 23 2005 - 17:11:38 CST

    On 23/04/2005 21:42, Asmus Freytag wrote:

    > ...
    > But in the spirit of hypothesizing a solution, I would consider using
    > an alias mechanism in the way aliases are used for Property names the
    > best solution. For properties (and their values) there exist multiple
    > aliases, which are all considered unique.
    > This mechanism has been used to fix typos in the name of properties.
    > For example the linebreak property called "inseparable" had been
    > called "inseperable". Instead of changing that name, the correct name
    > has become the preferred alias and the incorrect name has been
    > retained as an alias. (A similar thing was done for an incorrect block
    > name: "Cyrillic_Supplement" instead of the incorrect
    > "Cyrillic_Supplementary"). The benefits of such a solution are:
    > 1) users can use a 'correct' name to refer to a property and don't
    > need to use an 'incorrect' name
    > 2) users are guaranteed that software will continue to understand the
    > old name, as all aliases are considered equivalent descriptions of the
    > property
    > 3) the UTC guarantees that all aliases from the same name space are
    > unique
    > 4) users can rely on no alias ever being retired
    > The current use of aliases for Unicode *character* names does not
    > follow any of these rules. They are merely alternate names that are
    > known to be used by some user community. However, if people other than
    > Peter Kirk consider the current situation in need of a formal
    > solution, then this more formal form of aliasing would be a way
    > forward. ...
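
The four properties listed above can be sketched as a small alias table. This is only a toy model, not how the UTC or any library implements property aliases; the class name `AliasRegistry` and the identifier `lb_IN` are made up for illustration.

```python
class AliasRegistry:
    """Toy model of one alias namespace (hypothetical, for illustration only)."""

    def __init__(self):
        self._owner = {}      # alias -> entity it names
        self._preferred = {}  # entity -> preferred alias

    def add(self, entity, alias, preferred=False):
        # Rule 3: an alias is unique within the namespace.
        owner = self._owner.get(alias)
        if owner is not None and owner != entity:
            raise ValueError(f"{alias!r} already names {owner!r}")
        self._owner[alias] = entity
        if preferred or entity not in self._preferred:
            self._preferred[entity] = alias

    # Rule 4: there is deliberately no remove(); aliases are never retired.

    def resolve(self, alias):
        # Rule 2: every alias, old or new, resolves to the same entity.
        return self._owner[alias]

    def preferred(self, entity):
        # Rule 1: users can be shown and can use the corrected name.
        return self._preferred[entity]


reg = AliasRegistry()
reg.add("lb_IN", "Inseperable")                  # original, misspelled name
reg.add("lb_IN", "Inseparable", preferred=True)  # correction becomes preferred
```

Both spellings resolve to the same entity, and only the preferred spelling changes.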

    I would support a move towards a more formal solution. But unfortunately
    this solution has the same deficiency which I pointed out earlier today
    with Hans' solution, namely:

    > it does not solve the problem. Many of the character name errors are
    > of the type where a name that was wrongly given to character A should
    > have been given to character B, and this type of change does not
    > allow the name to be reassigned to character B.
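
A minimal sketch of that deficiency, assuming aliases are unique and never retired as described for property aliases. The names "NAME X" and "NAME Y" and the characters "A" and "B" are placeholders, not real Unicode data.

```python
aliases = {}  # alias -> character; entries are unique and never removed


def add_alias(alias, character):
    """Register an alias, refusing any name already owned by another character."""
    owner = aliases.setdefault(alias, character)
    if owner != character:
        raise ValueError(f"{alias!r} is permanently assigned to {owner!r}")


add_alias("NAME X", "A")  # the original assignment, wrongly given to A
add_alias("NAME Y", "A")  # a corrective alias for A is allowed

try:
    add_alias("NAME X", "B")  # B is the character that should have had NAME X
    reassigned = True
except ValueError:
    reassigned = False
```

Under these assumptions `reassigned` ends up `False`: A can gain a better name, but B can never receive the name it should have had.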


    > How could this be done? One very limited way would be to add to the
    > list of Unicode 1.0 character names. That would allow the use of a
    > single alternate formal alias for characters, which should be quite
    > suitable for corrections to the names with obvious errors. These would
    > be printed with a special convention (for example all uppercase). The
    > existing use of informal character name aliases (in lower or mixed
    > case) would continue as before.
    > A more extensive approach would be to introduce a full-fledged
    > CharacterNameAliases.txt file, which would not put an arbitrary
    > constraint on the number of aliases. Even in this case, the aliases in
    > the file should be restricted to formal aliases only, which would tend
    > to keep their number between 1 and 2 for almost all characters (since
    > the original name is counted as an alias as well, the numbers are 1
    > and 2 rather than 0 and 1).

    I think there may be a significant number of characters for which
    various people might want to propose more than one formal alias in
    addition to the original name. Although I can understand that the UTC
    might want to try to restrict the number of aliases, it might be tying
    their hands unnecessarily tightly to use a mechanism which cannot cope
    with more than one alias per character.
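
To illustrate the difference, here is a sketch of reading a separate alias file that allows any number of aliases per character. The layout (one "code point;alias" pair per line, "#" for comments) is an assumption modelled loosely on other Unicode data files, and the private-use code points and names are placeholders.

```python
from collections import defaultdict

# Hypothetical file content; nothing here is real Unicode data.
SAMPLE = """\
# code point; formal alias
E000;FIRST CORRECTED NAME
E000;SECOND CORRECTED NAME
E001;ANOTHER CORRECTED NAME
"""


def parse_aliases(text):
    """Return {code point: [aliases...]}, keeping every alias in file order."""
    table = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, alias = (field.strip() for field in line.split(";", 1))
        table[int(code, 16)].append(alias)
    return dict(table)


aliases = parse_aliases(SAMPLE)
```

A single extra name field per character could hold only the first alias for U+E000, whereas the list keeps both.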

    Peter Kirk
