RE: An Absurdly Brief Introduction to Unicode (was Re: Perception ...)

From: Marco Cimarosti
Date: Fri Feb 23 2001 - 04:53:56 EST

Paul Keinanen wrote:
> Regarding how to describe Unicode in the public, I think it is best to
> say that it can encode more than a million characters, of which about
> 100000 (in 3.1) is used. It is better to defer the discussion of any
> transformation forms to a much later stage.

I don't agree.

UTFs are the *surface* of Unicode, so many people's first (or only) contact
with Unicode is when they meet terms like "UTF-8" or "Unicode (Big-Endian)"
in the head of their HTML files, or in the user interface of their word
processors.

I feel that there is a need for short but accurate explanations of what
these acronyms mean.

In particular, simply saying that Unicode has one million possible
characters creates unjustified alarm about huge memory requirements
and/or complicated specialized applications.
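
To make the memory point concrete, here is a minimal sketch, assuming
Python 3 and its built-in codecs. The sample characters and the helper name
`encoded_sizes` are picked only for illustration. It counts how many bytes
one character takes in each encoding form:

```python
# Hypothetical sample characters, chosen only for illustration.
SAMPLES = {
    "ASCII letter": "A",
    "accented Latin letter": "\u00e8",        # e with grave accent
    "CJK ideograph": "\u4e2d",
    "supplementary character": "\U00010400",  # a code point above U+FFFF
}


def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding form."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-be" codecs are used so that no byte order mark is counted.
        "utf-16": len(text.encode("utf-16-be")),
        "utf-32": len(text.encode("utf-32-be")),
    }


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        print(f"{label:26} {encoded_sizes(ch)}")
```

No character here needs more than four bytes in any form, and plain ASCII
text in UTF-8 costs exactly what it costs today.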

In particular, not mentioning the existence of UTF-8 hides the reassuring
fact that Unicode can be as compatible with ASCII-oriented applications as
any other encoding system. And this is information of great practical
importance.
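
The ASCII compatibility of UTF-8 can be sketched in a few lines, again
assuming Python 3. The helper name `low_bytes` and the sample string are
hypothetical. Pure ASCII text is byte-for-byte identical in UTF-8, and
non-ASCII characters only ever produce bytes of 0x80 and above, so an
ASCII-oriented program looking for "<" or "/" is never fooled:

```python
def low_bytes(text):
    """Return only the bytes below 0x80 from the UTF-8 form of `text`."""
    return bytes(b for b in text.encode("utf-8") if b < 0x80)


# Pure ASCII text: the UTF-8 bytes are exactly the ASCII bytes.
plain = "plain ASCII"
same_bytes = plain.encode("utf-8") == plain.encode("ascii")

# Mixed text: the two non-ASCII characters contribute no bytes below 0x80,
# so the ASCII markup survives untouched.
mixed = "<p>caff\u00e8 \u4e2d</p>"
markup_only = low_bytes(mixed)

if __name__ == "__main__":
    print(same_bytes)
    print(markup_only)
```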

(And you should not forget an annoying but unfortunately true fact: people
making decisions are always too busy to listen to long explanations, and
almost always too idiotic to understand them. So either you prepare short and
well-conceived explanations to convince them, or you have a gun and no
witnesses around, or they will very likely make the wrong decisions. ;-)

_ Marco
