Re: Naming of functional ASCII characters in Unicode

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Thu Jun 08 2000 - 18:41:20 EDT


On Wed, 7 Jun 2000 12:34:49 -0800 (GMT-0800), Bernd Warken wrote:
> In programming and old-style databases the used characters stand for
> syntax and are not meant for giving a nice look when printed, while
> type-setting has the intention to add aesthetical value to the information.

From the examples presented below, I draw a different conclusion:
for lack of code-space, ASCII has unified look-alike characters,
while larger character sets (as used in typesetting, and lately
in general computing) can afford to keep these characters distinct.

> This leads to using different characters for a prime, a single quote, an
> accent aigu, an apostrophe, an syntactical character in programming, etc.
> - altho a single character from the ASCII chart could do the trick for all
> of these applications. Instead Unicode created different characters for
> all of these meanings, but the syntactical task cannot be moved to some
> other character code.

Quite to the contrary: there used to be different characters for different
functions, such as opening vs. closing quotes, both in handwriting and
printing. Typewriter technology could not cope with that many different
characters, so several similar-looking (yet functionally different)
characters were unified in a common glyph, e.g. an opening-or-closing
quote glyph. (I have even used a typewriter lacking the digits "0" and "1";
you had to use the letters "O" and "l" instead.) Later, ASCII adopted
a popular typewriter-based character set, and the programming languages
had to cope with it. Unicode has, in a sense, revived the original
functional distinctions that had been hidden by the typewriter and ASCII.
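
To make the point concrete, here is a minimal Python sketch listing a few
of the Unicode characters that the single ASCII code 27 (hexadecimal) has
to stand in for. The selection and the helper name describe() are my own
illustrative choices, not an exhaustive or official list.

```python
import unicodedata

# Characters that the typewriter, and later ASCII, folded into the
# single code 0x27. Illustrative selection only, not exhaustive.
LOOKALIKES = [
    "\u0027",  # the unified ASCII character
    "\u2018",  # opening single quote
    "\u2019",  # closing single quote
    "\u2032",  # prime
    "\u00b4",  # acute accent (spacing)
    "\u02bc",  # apostrophe used as a letter
]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LOOKALIKES:
    print(describe(ch))
```

Each entry prints with its own code point and its own name, which is the
functional distinction that ASCII had no room to express.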

> So the main task of the ASCII character is functional for programming
> purposes, not for pretty-printing.

So is the main task of the richer character set used in typography.
There is no difference in principle, only a different size of code-space;
an ASCII 61 (hexadecimal) can represent the most beautiful, aesthetically
appealing "a", optimally kerned with respect to its neighbourhood -- or a
Fortran real-type variable.

On the other hand, the usage of ASCII in programming has not adopted all
of the ASCII unifications; e.g. at least one programming language, viz.
Algol 60, has different tokens for opening and closing quotes (in ASCII
representation, the trigraphs '(' and ')' ).

> So it's a pity the naming and the glyph does not reflect [the function
> for programming purposes].

Anyway, this would be impossible, as various programming languages assign
different functions to the same ASCII characters. E.g., both ASCII 22
and 27 (hexadecimal) are popular for string quotes; ASCII 23, 25, 2D, 2F,
7B, and various other representations (digraphs, trigraphs, keywords) are
used as comment delimiters; and so forth.
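
For readers who do not have the ASCII chart memorised, the following
small Python sketch decodes the hexadecimal codes mentioned above. The
names STRING_QUOTES, COMMENT_DELIMITERS and decode() are hypothetical,
chosen only for this illustration.

```python
# Hexadecimal ASCII codes named in the message, grouped by the role
# the message assigns to them.
STRING_QUOTES = ["22", "27"]
COMMENT_DELIMITERS = ["23", "25", "2D", "2F", "7B"]


def decode(hex_codes):
    """Map each two-digit hex code to its ASCII character."""
    return {code: chr(int(code, 16)) for code in hex_codes}


print(decode(STRING_QUOTES))
print(decode(COMMENT_DELIMITERS))
```

The decoded characters are the double quote and the apostrophe for
strings, and # % - / { for comments. As examples of my own (not taken
from the list above): # opens comments in Unix shells, % in TeX, a
doubled - in Ada, / followed by * in C, and { in Pascal.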

Best wishes,
   Otto Stolz


