Re: String name and Character Name

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Apr 25 2005 - 05:28:47 CST


    Hello Peter Kirk,

    I had written:
    > Have you ever read Section C.6 of TUS
    > <http://www.unicode.org/versions/Unicode4.0.0/appC.pdf>?

    This was meant as a pointer to the sentence quoted below.
    It was not meant to hurt your feelings, and I apologize if it did so.

    Quote from TUS C.6:
    > In the ISO/IEC framework, the unique character name is viewed as the
    > major resource for both character semantics and cross-mapping among
    > standards.

    Quote from <http://www.unicode.org/versions/Unicode4.0.0/ch16.pdf>:
    > [...] Character names are unique and stable. [...]

    I think this is the central statement on which most of this
    thread hinges.

    You have written:
    > I would expect you to
    > understand that nowhere in the above quotation is there even the
    > slightest suggestion that "the intended purpose of the nameslist [is
    > only] providing an unique and immutable identifier",

    I think the quote above says exactly that the intended purpose of
    the nameslist is to provide a unique and immutable identifier.

    So it is probably the emphasis on the "only" purpose of the character
    names that you are missing? It goes without saying, though, that this
    rule excludes every usage incompatible with it. As we have seen in this
    thread, the character names (without the aliases, cf. infra) are not
    suited to a user interface, precisely because they are immutable.

    You have continued:
    > [I would expect you to understand that nowhere in the above quotation
    > is there even the slightest suggestion that the intended purpose of
    > the nameslist ] "does not explicitly include the task of supporting
    > users in identifying characters".

    Do you really expect a standard to state explicitly what it "does not
    explicitly include"? I deem this a logical impossibility akin to the
    Epimenides paradox.

    You also have observed:
    > Elsewhere this section does state:
    > > the formal character names may differ in unexpected ways from
    > > commonly used names
    and assessed this statement thus:
    > but fails to draw the obvious conclusion, and the one accepted by the
    > UTC that according to Asmus, that formal character names should not be
    > considered to have any significance except in that they are unique and
    > immutable.

    Actually, the unabridged quote from the subsection "Aliases" reads thus:
    > Because the formal character names may differ in unexpected ways from
    > commonly used names (for example, PILCROW SIGN = paragraph sign), some
    > aliases may be useful alternate choices for indicating characters in
    > user interfaces.

    Now, here we have TUS mentioning character names in user interfaces.
    Thank you for pointing me there :-)

    As "Aliases are informative and may be updated" (from the same
    paragraph), here is the hint for everybody planning to use the Unicode
    character names in a user interface: Just include the aliases, and
    report any inaccuracies and omissions in the aliases via the official
    channel <http://www.unicode.org/reporting.html> to have them mended in
    TUS. (Note also the existence of <http://www.unicode.org/errata/>
    for any urgent cases.)

    You have focussed on different sentences in the very same text than
    I did; hence we arrive at different conclusions.

    You have written:
    > But there is a problem in that this decision of the UTC has not been put
    > into proper effect even within the text of the Unicode standard itself,
    > in which there are a huge number of cases of a Unicode character name
    > being given semantic significance. For an example taken almost at
    > random, I quote the following from section 16.1, p.415:
    > > When a case mapping corresponds solely to a difference based on SMALL
    > > versus CAPITAL in the names of the characters, the case mapping is not
    > > given in the names list but only in the Unicode Character Database.
    and I complete your partial quote with:
    > When the case mapping cannot be predicted from the name, the information
    > is given in a note.

    You have continued:
    > In other words, case mappings depend on character names, in breach of
    > the principle that "the intended purpose of the nameslist [is only]
    > providing an unique and immutable identifier".

    Again a wrong conclusion. The case mapping does not depend on the
    character names; rather, the character names are exploited to simplify
    the presentation in TUS, chapter 16.

    Remember that this quote is from the section 16.1 "Character Names List".
    It simply describes the convention for that list: how it is written and
    how it is meant to be read. Hence the above sentence simply says: "If the
    case mapping is obvious, we will not clutter this list with it; however,
    if it is not obvious, we will mention it in an annotation. In any case,
    you can look at the Unicode Character Database, which is comprehensive
    in this respect."
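
    To illustrate the difference, here is a rough Python sketch. It
    assumes that Python's str.lower() draws on case-mapping data derived
    from the Unicode Character Database; lowercase_guessed_from_name is a
    hypothetical helper that tries to predict the mapping from the name
    alone, by swapping CAPITAL for SMALL:

```python
import unicodedata

def lowercase_guessed_from_name(ch):
    """Guess the lowercase partner from the character name alone.

    Returns None when the name contains no CAPITAL, or when no character
    carries the name obtained by replacing CAPITAL with SMALL.
    """
    name = unicodedata.name(ch, "")
    if "CAPITAL" not in name:
        return None
    try:
        return unicodedata.lookup(name.replace("CAPITAL", "SMALL"))
    except KeyError:
        return None

# LATIN CAPITAL LETTER A, LATIN CAPITAL LETTER E WITH ACUTE,
# OHM SIGN, KELVIN SIGN
for ch in ["A", "\u00C9", "\u2126", "\u212A"]:
    guessed = lowercase_guessed_from_name(ch)
    actual = ch.lower()  # taken from the case-mapping data, not the name
    print(unicodedata.name(ch), "| name-based guess:", guessed,
          "| actual mapping:", unicodedata.name(actual))
```

    For the first two characters the guess agrees with the data; for
    OHM SIGN and KELVIN SIGN the name offers no guess at all, yet the
    mapping exists. The names serve as a shorthand for presentation
    where they happen to fit, while the data remains the authority.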

    Best wishes,
        Otto Stolz


