Re: Unicode & Han

From: Martin J Duerst (mduerst@ifi.unizh.ch)
Date: Sat Aug 10 1996 - 09:27:43 EDT


Timothy Huang wrote:

>Dear Michael,
>

>As far as my statement on Microsoft looking for other coding scheme, if
>you can read Chinese computer news, you will know I am LIVE RIGHT. The
>captain is abandoning the ship. Why? Because the coding structure and
>the implementation of Unicode are DEADLY WRONG. For example, can anyone
>tell me what is the definition of a Character? And what is a glyph? In
>version 1.0 of Unicode book, code 337B ~ 337F, can these be called as
>characters? If so, I can give you many more examples in Chinese. And
>then, why the Japanese emperor's names can be coded, but not the
>Chinese? In Chinese history, there were more than 500 emperors, some had
>more than one name. Why the Wester Chess symbols were coded as
>characters?, but not the Chinese Mar-jhon? Cultural superiority?

In Japanese history, there were also over 100 emperors, and many of
them had more than one name, but Unicode only contains premade
combinations for the latest four of them, from the modern era.
They are there obviously for compatibility reasons, on request of
the Japanese delegation or some US vendor that used them in its
Japanese system version. They are not contained in the official
Japanese standards (JIS 208 and 212). Obviously, there was no
request in this direction from China or from Taiwan. Neither of
them, in contrast to Japan, currently uses emperors' names to
count years. For Chinese Chess or Mah-jong, obviously also
nobody made any request to include them. Unicode is not sealed
off, you can still make a request through your national standards
body if you can document it well enough (which should not be
that much of a problem in these cases).
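
For readers who want to check the compatibility status of the code
points mentioned above (337B ~ 337F), here is a minimal sketch using
Python's standard unicodedata module. The helper name describe() is
only illustrative, and the output depends on the Unicode data shipped
with your Python version, not on Unicode 1.0.

```python
import unicodedata

def describe(cp):
    """Return the name, decomposition and NFKC form of one code point."""
    ch = chr(cp)
    return {
        "name": unicodedata.name(ch),
        # A "<square>" tag marks a compatibility decomposition.
        "decomposition": unicodedata.decomposition(ch),
        # NFKC replaces the premade combination with ordinary ideographs.
        "nfkc": unicodedata.normalize("NFKC", ch),
    }

# U+337B..U+337E are era names; U+337F is a square form of
# "kabushiki gaisha" (corporation), not an era name.
for cp in range(0x337B, 0x3380):
    info = describe(cp)
    print(f"U+{cp:04X} {info['name']}: "
          f"{info['decomposition']} -> {info['nfkc']}")
```

Each of these characters carries a <square> compatibility decomposition
into ordinary ideographs, which fits the view that they were included
for compatibility with existing systems.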

>I think
>the root of the problem is that the Unicoder DOES NOT understand what is
>a character. And this is the deadly vital problem. And in my opinion,
>until the Unicoders start to respect different culture and language,
>they won't be able to do the coding right.

Unicode pretty well understands what a character is. But they also understand
that they have to consider the past, with backwards compatibility. And the
designers of the past did not always understand what a character is.

In addition, in China or in Japan, people have very different opinions
about what a character is. Even the same person can use the same word
differently depending on circumstances, even in the same sentence.
Among "experts", the differences may get very pronounced. It's not
as easy as "respect Chinese culture" or "respect Japanese culture".
Apart from this, and from backwards compatibility, Unicode does
whatever it can to encode characters in a consistent way.

Regards, Martin.


