Re: String name and Character Name

From: Hans Aberg (
Date: Sat Apr 23 2005 - 07:01:47 CST

    I would suggest that the Unicode standard add a formal definition of
    the notion of "character name", somewhat along these lines:

    First, I do not see a need to change the word "name" to, say,
    "identifier", as "name" is frequently used to indicate a form of
    identifier. For example, one often speaks about a "file name", not a
    "file identifier"; the latter would probably be used when the
    identifier is a number or something similar.

    Second, character names apply only to abstract characters. The
    so-called "surrogate code points" are not to be regarded as abstract
    characters. Private characters are clearly not part of any Unicode
    standard, but the standard could indicate that it is wise not to
    give them names that conflict with Unicode standard names. One
    might set up an Internet service to register private character
    names. It is considerably easier for humans to register new
    character names than code points.

    Third, a "character name" is defined to be an identifier that
    logically uniquely identifies the character. Its intent is to be
    human readable and to help a human identify the abstract character.
    From the formal logical point of view, one defines a metaset of
    metasymbols A-Z plus space. The use of "meta" here indicates that
    these characters are not to be confused with any Unicode abstract
    character; in terms of a logical theory, these metacharacters
    are primitives. A formal character name is a finite sequence from
    this metaset, not starting or ending with a space, and not containing
    two adjacent spaces. One can note that with this definition, any
    Unicode text can be represented using these metacharacters: each
    Unicode character is represented by its Unicode character name, and
    the names are separated by a double space, which is unambiguous
    because no character name can contain a double space.
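
    As a rough illustration of the definition above, here is a minimal
    Python sketch. It is not part of any Unicode specification. The
    names is_formal_name, encode, decode, and the tiny NAMES table are
    hypothetical and chosen only for this example, and the sketch
    assumes that the empty sequence is not an acceptable name.

```python
import re

# Metaset from the proposal: the metasymbols A-Z plus space.
# A formal name is one or more runs of A-Z joined by single spaces, so it
# cannot start or end with a space or contain two adjacent spaces.
# Assumption: the empty sequence is rejected.
_FORMAL_NAME = re.compile(r"[A-Z]+(?: [A-Z]+)*")

# Hypothetical three-entry name table, for illustration only.
NAMES = {
    "a": "LATIN SMALL LETTER A",
    "b": "LATIN SMALL LETTER B",
    " ": "SPACE",
}

# Names are separated by a double space, which no formal name can contain.
SEPARATOR = "  "


def is_formal_name(candidate: str) -> bool:
    """Return True if candidate satisfies the proposed definition."""
    return _FORMAL_NAME.fullmatch(candidate) is not None


def encode(text: str) -> str:
    """Represent text as character names separated by double spaces."""
    return SEPARATOR.join(NAMES[ch] for ch in text)


def decode(encoded: str) -> str:
    """Invert encode() by splitting on the double-space separator."""
    if not encoded:
        return ""
    reverse = {name: ch for ch, name in NAMES.items()}
    return "".join(reverse[name] for name in encoded.split(SEPARATOR))
```

    For example, encode("a b") yields "LATIN SMALL LETTER A  SPACE
    LATIN SMALL LETTER B" on a single line, with two spaces between
    the names, and decode recovers the original text.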

    Efforts can then focus on giving each abstract character a fair
    character name. The effort of finding linguistically correct
    descriptions will be deflected elsewhere.

       Hans Aberg

    This archive was generated by hypermail 2.1.5 : Sat Apr 23 2005 - 07:03:21 CST