Re: Unicode & Han

From: Michael Everson
Date: Sat Aug 10 1996 - 08:24:05 EDT

At 09:10 1996-08-10, Timothy Huang wrote:

>Dear Michael,
>I would like to be corrected if you can provide me more latest
>information I don't know of. May I ask, as far as you know, which
>version of Unicode, 2-byte or UTF-16 or whatever, is been used or under
>implementation by any computer company?

I myself use a Power Macintosh, running System 7.5.3, which does not yet
support Unicode. So I am in the same position as you -- waiting for an
implementation which I can use. I am patient, however, because I know that
Apple are taking their time to make an implementation that works well. I
currently have a Chinese implementation running on my Mac. It serves me
well enough. It doesn't have a million characters. I'd like to see Unicode
sooner rather than later too. It would make my fonts more interesting.

>As far as my statement on Microsoft looking for other coding scheme, if
>you can read Chinese computer news, you will know I am LIVE RIGHT. The
>captain is abandoning the ship.

I can only imagine that UTF-16 is what they are talking about. No, I cannot
read the Chinese computer news.

>Why? Because the coding structure and
>the implementation of Unicode are DEADLY WRONG. For example, can anyone
>tell me what is the definition of a Character?

"4.6 character. A member of a set of elements used for the organization,
control, or representation of data.
"4.18 graphic character: A character, other than a control function, that
has a visual representation normally handwritten, printed or displayed."

>And what is a glyph?

"4.18 graphic symbol. The visual representation of a graphic character or
of a composite sequence."

>In version 1.0 of Unicode book, code 337B ~ 337F, can these be called
>as characters?

Of course they can. "Character" can have a different meaning than "hanzi"
does. See the definition above.
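
A minimal sketch for checking how the code points in question are catalogued,
assuming a current Python with the standard `unicodedata` module. Its data
reflects a much later Unicode version than the 1.0 book, and the helper name
`describe_range` is purely illustrative:

```python
import unicodedata

def describe_range(first, last):
    """Return (code point, character name, general category) for first..last inclusive."""
    rows = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        # name() falls back to the given default if the code point is unnamed
        rows.append((f"U+{cp:04X}",
                     unicodedata.name(ch, "<unnamed>"),
                     unicodedata.category(ch)))
    return rows

# The range asked about: 337B ~ 337F
for code, name, category in describe_range(0x337B, 0x337F):
    print(code, name, category)
```

Each entry is reported with general category "So" (Symbol, other), which fits
the definition quoted above: a character is a coded element used to represent
data, and it need not be a hanzi.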

>If so, I can give you many more examples in Chinese. And
>then, why the Japanese emperor's names can be coded, but not the

Probably because those were already used in *data* *processing* in an existing
standard, since the imperial calendar is used by people in Japan.

>In Chinese history, there were more than 500 emperors, some had
>more than one name.

Look, Huang, the absence of a character in the standard is not an indicator
of the importance of one culture or another.

>Why the Western Chess symbols were coded as
>characters?, but not the Chinese Mar-jhon? Cultural superiority?

No. It had to do with the fact that these are used in typesetting and they
were probably included in a coded character set. Do you have data
processing requirements for mah-jong? Then you can talk to your national
representative about getting those entities encoded.

>I think
>the root of the problem is that the Unicoder DOES NOT understand what is
>a character. And this is the deadly vital problem. And in my opinion,
>until the Unicoders start to respect different culture and language,
>they won't be able to do the coding right.

You are overstating the case here. I'm a member of SC2/WG2, the ISO
committee which makes 10646. We cooperate with Unicode. I have some
problems with what I would call "over-decomposition" in Unicode philosophy
for alphabetic scripts. We work to come to consensus. It's not always easy.
The Ideographic Rapporteur Group has been working very hard assembling new
CJK characters for inclusion in the standard. If you are interested in the
work, you could try to influence it by giving input nationally. You could
talk to the Taipei Computer Association, to Gary Twu of ACER in Taipei, or
T.C.Kao of TCA. These names came from a WG2
mailing list. But complaining about it the way you have been doesn't help.
Your frustration is understandable -- but your script is complex. The work
is underway, even if it's not fast enough for you.

>Timothy Huang

Best regards,

Michael Everson, Everson Gunn Teoranta
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire (Ireland)
Gutháin:  +353 1 478-2597, +353 1 283-9396
27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire

