Re: Possibilities of future expansion

From: Michael Everson
Date: Mon Feb 26 2001 - 06:31:11 EST


>I know this has been hashed over time and time again, and the answer has
>been handed down as if by edict time and again, but _your_ attitude as
>expressed below is taken by many who are not involved as rather arrogant.

I can't see why. We have very clear rules. Code points are assigned
to characters which have been agreed by the two standardization
committees to meet various criteria. These criteria are fairly clear,
and the rules are not difficult to understand.

The standard also provides an area for private use. And the standard
specifically states that no meaning is assigned to those positions,
and that people who use them shall not assume that anyone else will
understand them. The UTC and WG2 have resolved NEVER to assign
meanings to any of the code positions in the PUA. What could be
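
For anyone who wants to check this mechanically, here is a minimal Python sketch of how software can recognize private-use code points. The helper names (is_private_use, describe) are made up for illustration; the three ranges are the private-use ranges of the standard, and the "Co" general category is what Python's unicodedata module reports for them.

```python
import unicodedata

# Private-use ranges of Unicode / ISO/IEC 10646.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


def describe(ch: str) -> str:
    """Hypothetical helper: name a character, or flag it as private use."""
    if is_private_use(ch):
        # The standard assigns no name or meaning here; any interpretation
        # is a private agreement between sender and receiver.
        return f"U+{ord(ch):04X}: private use, no standard meaning"
    return f"U+{ord(ch):04X}: {unicodedata.name(ch, '<unnamed>')}"


print(describe("A"))
print(describe("\ue000"))
```

The point of the sketch is only that a private-use code point carries no name and no meaning of its own: a receiver can detect that one is present, but cannot know what the sender intended by it.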

>To many people, it seems like the UNICODE has taken in hand to define language

1. "Unicode" isn't an acronym, and should not be written in all caps.

2. People who think that this is what Unicode and ISO/IEC 10646 are
doing should read the text of Unicode and ISO/IEC 10646 to learn
what the scope of these standards actually is.

> The explanations, no matter how well founded, sound to ordinary
>people like slick lawyers trying to cover up something bad with legalese.

I don't believe this. Sorry, but I don't believe it. Which "ordinary
people" are these?

>This is why I wrote earlier to suggest that the standard should focus on
>commonality rather than universality.

What does this mean?

>Rather than trying to hold the
>standard for _all_ characters under one organization, hold the standard for
>a small, workable subset under the organization, standardize a method for
>groups external to the organization to register their own sets and
>(necessarily partial) mappings into the common set, and let the people
>closest to each language take their own good time about registering.

Unicode and ISO/IEC 10646 together define the Universal Character
Set. It is a one-stop solution for encoding the characters used by
the writing systems of the world. The kind of "registry" you are
talking about already exists: it is defined by ISO/IEC 2375, an
ISO/IEC 2022-based scheme for the registration of coded character
sets that are accessed by escape sequences.

The ISO/IEC 2022 solution was technically feasible, but industry did
not, by and large, accept it and it was not widely implemented.
Instead, we have Unicode, a single coded character set managed by an
ISO/IEC Working Group and an industrial consortium.
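
To make the difference concrete, here is a small Python sketch. It uses ISO-2022-JP, one widely known ISO/IEC 2022-based encoding that ships with Python as the "iso2022_jp" codec, purely as an illustration; the helper name count_escapes is made up for this example.

```python
ESC = b"\x1b"

# The three characters U+65E5 U+672C U+8A9E (the Japanese word for
# "Japanese language"), written with escapes to keep the source ASCII.
text = "\u65e5\u672c\u8a9e"

# ISO/IEC 2022 style: the byte stream switches between registered
# character sets by means of escape sequences.
iso2022 = text.encode("iso2022_jp")

# Unicode style: one coded character set, here serialized as UTF-8,
# with no set-switching state in the stream.
utf8 = text.encode("utf-8")


def count_escapes(data: bytes) -> int:
    """Hypothetical helper: count ESC bytes in an encoded stream."""
    return data.count(ESC)


print(iso2022)   # begins with ESC $ B and ends with ESC ( B
print(utf8)
print(count_escapes(iso2022), count_escapes(utf8))
```

In the ISO-2022-JP stream, ESC $ B selects the registered JIS X 0208 set and ESC ( B switches back to ASCII, so a receiver has to track which set is currently in effect. The UTF-8 stream needs no such state, because every character comes from the same coded character set.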

>The smaller approach will meet less resistance, be more flexible, and take
>less time.

The fact that all the major players on the computing scene are
implementing Unicode means that Unicode will soon be on everyone's
desk. I can't see who is going to resist that.

Arrogant? Me? Sometimes. Here? I don't know. These are the facts.

Best regards,

Michael Everson  **  Everson Gunn Teoranta  **
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Mob +353 86 807 9169 ** Fax +353 1 478 2597 ** Vox +353 1 478 2597
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire
