Re: Chapter on character sets

From: Edward Cherlin
Date: Sat Jun 24 2000 - 13:20:16 EDT

[sigh] I am reminded of a joke. The correct English word for a person
who speaks more than one language is "polyglot", from the Greek
"poly-" (many) and "glot-" (tongue). The correct term for a person
who speaks only one language is "American".

At 1:48 PM -0800 6/20/00, wrote:
>On 06/19/2000 05:04:30 PM <> wrote:
> >- In section 1.3.4., you state that "most of the world only need the lower
> >characters". Saying "much of the world" is more defensible.
>Indeed, "most" would be blatantly wrong.
>Peter Constable

More than half of the world's population uses languages written in
Han characters with assorted extensions or in one or more of the ~30
non-Roman alphabetic or syllabic writing systems included in Unicode
3.0. (There are other writing systems in current use, of course.) Of
those using the Roman alphabet with extensions (i/j, u/v, w, eth,
thorn, accented letters, punctuation, Latin Extended A and B,...) few
if any can be fully rendered using ISO-8859-1 alone, or any other
8-bit character set and encoding.
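
For anyone who wants to test the ISO-8859-1 part of that claim on their
own text, here is a minimal sketch in Python. The helper name
missing_from_latin1 and the sample words are illustrative assumptions,
not anything taken from the chapter under discussion, and the check
covers ISO-8859-1 only, not the other 8-bit sets.

```python
def missing_from_latin1(text):
    """Return the characters in `text` that ISO-8859-1 cannot encode,
    in first-seen order and without duplicates."""
    missing = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Sample words are written with escapes so the source stays plain ASCII.
samples = {
    "English": "na\u00efve caf\u00e9",    # i-diaeresis and e-acute are in 8859-1
    "French": "c\u0153ur",                # oe ligature (U+0153)
    "Polish": "\u0142\u00f3d\u017a",      # l-stroke (U+0142), z-acute (U+017A)
    "Turkish": "a\u011fa\u00e7",          # g-breve (U+011F)
}

for language, word in samples.items():
    gaps = missing_from_latin1(word)
    codes = ", ".join("U+%04X" % ord(ch) for ch in gaps) or "none"
    print("%-8s missing from ISO-8859-1: %s" % (language, codes))
```

Even French, one of the languages 8859-1 was designed for, trips over
the oe ligature; Polish and Turkish need 8859-2 and 8859-9
respectively, and no single 8-bit set covers all three at once.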

We get by in English and a few European languages with Windows code
pages, MacRoman, 8859-x and the like much less painfully than we used
to with 88-character typewriters, but there is still a lot missing.
The basic computer character set since PostScript came into
widespread use includes Dingbats, Symbol, the even older box-drawing
characters, and more. I have been told that typesetting shops used to
feel that 400-500 glyph types were adequate for English, including
elementary math and dingbats, but no graphics characters. Math needs
more than 800 characters, although no individual mathematician is
likely to use anywhere near that many. Still, math and science
education worldwide also has to be considered a *need*.

The correct statement is that everybody in the world needs more than
the lower 256 characters, but for e-mail and news *some* of us can
get by on those alone.

Then there are the few like us on this list. I use Latin, Russian,
Greek, and Hebrew alphabets, Chinese, Japanese, Korean, math, and
APL, and there are others here with more extensive lists.

Edward Cherlin, Spamfighter <>
"It isn't what you don't know that hurts you, it's
what you know that ain't so."--Mark Twain, or else
some other prominent 19th century humorist and wit

