RE: Unicode Bloopers

From: Hans Aberg
Date: Wed Apr 20 2005 - 12:00:19 CST

    At 08:18 -0700 2005/04/20, Peter Constable wrote:
    >I know of entire ranges of characters (one of them quite large) that
    >some would like to reckon this way. It wouldn't be appropriate to
    >actually list them, however.

    One such range is the mathematical monospace characters. The idea
    was to use them in computer code. The error here is that of not
    analyzing the underlying semantics sufficiently. In math, changing
    styles usually changes the math semantics, because styles are used to
    indicate different logical objects. For example, "sin" in plain or
    boldface would mean different things, as opposed to, say, English as
    a natural language, where both are the same word "sin". Now, in
    computer code, the semantics does not change either; styles are only
    used to make the code more legible. For example, a computer language
    would not accept "sin" in both plain and boldface, and assign
    different meanings to the two words. In addition, the monospace style
    is simply a style that somehow became common in some quarters when
    writing computer code. Perhaps one wanted to get the code nicely
    aligned in columns, or to emulate teletypes. But in several computer
    science books, monospace is not used at all to indicate
    computer code.
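
    A minimal sketch of the point about computer code, in Python. It
    assumes a language that folds identifiers with NFKC compatibility
    normalization (Python 3 does; other languages may differ). The
    helper "styled" and the constant names are hypothetical, chosen only
    for illustration.

```python
import unicodedata

# Start of the lowercase runs in the Mathematical Alphanumeric Symbols
# block; both runs are contiguous for a-z.
MONO_A = 0x1D68A  # MATHEMATICAL MONOSPACE SMALL A
BOLD_A = 0x1D41A  # MATHEMATICAL BOLD SMALL A


def styled(word, base):
    """Hypothetical helper: map ASCII a-z onto the styled alphabet at base."""
    return "".join(chr(base + ord(c) - ord("a")) for c in word)


mono_sin = styled("sin", MONO_A)
bold_sin = styled("sin", BOLD_A)

# As raw code points, the three spellings are distinct strings ...
assert len({"sin", mono_sin, bold_sin}) == 3

# ... but compatibility normalization (NFKC) folds the style away.
assert unicodedata.normalize("NFKC", mono_sin) == "sin"
assert unicodedata.normalize("NFKC", bold_sin) == "sin"

# Python applies NFKC to identifiers while parsing, so an assignment
# spelled in monospace letters binds the plain name "sin".
namespace = {}
exec(mono_sin + " = 1", namespace)
assert namespace["sin"] == 1
```

    So the styled spellings are different at the character level, but
    the language treats them as the same word and assigns no separate
    meaning to the style.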

    Now, we discussed this thoroughly, I recall, in 2002 or thereabouts.
    We then noted that these characters were in fact added in error, but
    that they could not be deleted, in view of Unicode's stability
    requirement. Nobody felt at that time that this was controversial.
    Their addition does not disturb Unicode as a whole -- if you do not
    like them, simply do not use them. On the other hand, now that these are
    available, somebody might find a correct, semantic use for these
    monospace characters.

    So, quite on the contrary, I think it is good that these bloopers are
    listed. Then, one day, if somebody wants to design linguistically
    more accurate character sets, these lists can be of use. The main
    purpose of the current Unicode character set is to provide characters
    so that most human-written text can be semantically represented in a
    computer, and I do not see that these bloopers affect that
    intent. Finding linguistically correct character sets, it seems to me,
    is a much harder problem, well beyond the more limited scope that
    Unicode apparently has with its current character set.

       Hans Aberg
