Re: Code Point -- What is the integer?

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Fri Apr 29 2005 - 04:28:26 CST


    On Fri, 29 Apr 2005, Hans Aberg wrote:

    > The things I would have done somewhat differently, as a
    > mathematician, is to develop it around a group of separate concepts,
    > then linking them together, rather than throwing the different pieces
    > altogether in one lump.

    Having a mathematical background, I have somewhat similar thoughts on the
    character concept in Unicode. But we must remember that Unicode tries to
    cover issues of human behavior and understanding, which are
    ("unfortunately", some people might add) not quite rigorously
    formalizable.

    > For example, I would not have used the word "character" everywhere, and
    > used the word "set" for a collection of something, rather than
    > different words like "repertoire".

    The word "set" was already in use for an ordered and often coded
    collection.

    > So, "abstract character set" seems
    > better than "abstract character repertoire"

    In some context maybe, but my main problem now, with the definitions, is
    the multitude of ways in which the word "character" and the expression
    "abstract character" are used. Moreover, does "abstract character
    repertoire" parse as repertoire of abstract characters or as a character
    repertoire that is abstract?

    Here's what the Unicode standard itself says in its glossary, which
    describes the term "character" in several different meanings.
    The first one is: "The smallest component of written language that has
    semantic value; refers to the abstract meaning and/or shape, rather than a
    specific shape (see also glyph), though in code tables some form of visual
    representation is essential for the reader's understanding." The second
    meaning is that "character" is a synonym for "abstract character", which is
    defined as "a unit of information used for the organization, control, or
    representation of textual data".

    The most obvious difference between character and abstract character seems
    to be that an abstract character could be a control function (say, newline
    or ESC), whereas a character is what many people call a graphic character
    or a printable character. But I don't think such a distinction is drawn
    systematically.
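
    As a rough illustration of the control versus graphic distinction (this
    is not taken from the glossary): the Unicode Character Database assigns
    each code point a General_Category value, and control functions such as
    newline and ESC have the value Cc, whereas letters, spaces and the like
    have other values. A minimal Python sketch using the standard
    unicodedata module follows. The helper name is_control is made up for
    this example, and General_Category is only a proxy for the distinction
    discussed above, not the glossary's own definition.

```python
import unicodedata

def is_control(ch: str) -> bool:
    """Return True if the code point's General_Category is Cc (control)."""
    return unicodedata.category(ch) == "Cc"

# A few sample code points: two control functions and two graphic characters.
samples = {"newline": "\n", "ESC": "\x1b", "letter A": "A", "space": " "}

for name, ch in samples.items():
    category = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} {name}: category={category}, control={is_control(ch)}")
```

    Newline and ESC come out as Cc here, while "A" is Lu and the space is
    Zs, so under this proxy only the first two count as control functions.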

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

