Re: Unicode & Han

From: Edward Cherlin
Date: Sun Aug 11 1996 - 13:03:13 EDT

>Dear Michael,

Edward, actually, or better U+9ED8, U+96F7. (Don't you wish we had Unicode
mail? Note for the Unicode list: Perhaps we should apply to Alis Software
in a body to be testers for their Unicode E-mail product when it gets
Chinese and a few other languages. I'm sure we could do some serious damage
to their preconceptions. I think I will go ask them.)
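
For readers whose mail cannot show the characters, here is a minimal Python sketch of what the two code points above stand for. It is only an illustration: the helper name `describe` is hypothetical, and the UTF-8 column reflects one possible encoding, not anything stated in this message.

```python
import unicodedata

# The two code points given in the message above.
CODE_POINTS = [0x9ED8, 0x96F7]


def describe(cp: int) -> dict:
    """Return the U+ notation, character, name, and UTF-8 bytes for a code point."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "char": ch,
        # Names for unified Han ideographs are derived from the code point.
        "name": unicodedata.name(ch, "<unnamed>"),
        # Assumption: UTF-8 as the transfer encoding, chosen only for illustration.
        "utf8": ch.encode("utf-8").hex(" "),
    }


if __name__ == "__main__":
    for cp in CODE_POINTS:
        info = describe(cp)
        print(info["code_point"], info["char"], info["name"], info["utf8"])
```

Whether the characters display correctly still depends on the fonts and mail software at the receiving end, which is the point of the complaint above.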

>Thanks for the information about companies using Unicode. I checked the
>web you gave me. Thanks. I only hope that Apple and the whole world all
>use it soon so that I can submit my Chinese articles to the magazine
>publishers here in Taiwan and mainland China without problems.

Apple promises good Unicode support in Copland, which was supposed to be
out by now. Delays are nothing new in the world of software, however. (Oh,
yes, and how about all of us Mac users on the list applying to Apple in a
body as testers? I think I'll go set that in motion, too.)

>And also
>hope the day will come very soon so that I don't have to find out that
>some characters I want and must use (such as a character for a given
>location or in someone's name...the characters for the
>elements above 100 or so...

If you have a list, make a proposal. Talk to the people doing the Han parts
of Unicode. Talk to the other Chinese standards groups. We depend on people
coming forth with needs to help us decide what to add next.

>About Microsoft, I'll give you the source of the news when I can
>find the clipping. Right now, as I write this reply, it's in
>one of the piles somewhere in my corner. When I find it, if you give me
>your fax number, I can even fax it to you. I saw the news and thought
>somebody might be interested to know.


>As for the character/glyph definitions, I know how Unicode defines them.
>However, defining them and following them are two separate matters, as
>in preaching and doing. Confusion will arise when someone says one thing
>and does the other. If Unicode wants to use the pragmatic definitions,
>that's perfectly OK. However, it should then not be claimed or projected
>to be one of the character codes. Please let's keep the thing simple. If
>it is a character code, let it contain only the characters, not the
>garbage. Otherwise, little users like us will be totally lost.

I don't understand your objection. Source character code compatibility is
extremely important, and Unicode clearly identifies the characters that
could be considered as "garbage" which should not be used in new documents.
Implementors can protect little users by displaying such characters
properly but making it hard to input them.

>So, from what you said, which "character" got into Unicode depended on who
>put them in. Well, I hate to be Chinese now, because our representatives
>failed us badly. I don't and won't believe that the Unicode people have any
>intention of being culturally biased, but the results printed in black
>ink unfortunately show that.

It is certainly true that the lack of imperial names shows a cultural bias
among the Communist regime on the mainland. Is that what you mean?

>Another suggestion to Unicode: I think
>it should invite some companies, scholars, and end users from other
>countries to be in the very kernel, not just the rich American
>companies. I say this with all due respect and sincere hope.

Please take another look. The rich American companies are there because
they put in their money and promised to use Unicode. Other companies can
join on the same terms. But the rich American companies aren't defining
Unicode except by contributing expertise (Huan Mei Liao of Apple, for
example). Scholars of languages and experts in implementing script systems
on computers are defining it. There has been considerable input from
Chinese scholars worldwide, and from Chinese experts in software. See the
acknowledgements pages v-vi in Unicode Standard 1.0 Vol. 1. The Principal
Investigator for Han unification is Dr. Yang Xiao-jie. The National Library
of China also participated. Since you are involved in a character set
standard, you could join this effort and be in the kernel yourself.

>Speaking of a Ma Jong font, if you like, I think we can make one. I can
>easily scan the images, and you can use some font-making software, such
>as Fontographer, to make them into a Ma Jong font.

Well, of course there is no difficulty making one. I don't need it myself,
but I can suggest some font vendors who have Western chess fonts and might
be interested in it, and in Chinese, Korean, and Japanese chess fonts as
well. Ishi Press, Yutopian, and other publishers of books on go (wei-chi,
padook) and Asian chess could certainly use them. I don't know who
publishes on Ma Jong, but I know there are some publishers who do.

>If you feel that I told you that you don't respect our culture, please
>accept my apology. I don't mean that personally. In fact, you are a
>better Chinese than I am -- I don't know how to play wei-chi and cook
>Sichuan style.

Now you are being much too polite.

>I just want everybody to contribute whatever they can so
>that we can have a better Chinese Computing environment to use (now).
>And if you read all the letters I have posted, you can see I DO respect
>Unicode greatly. What I am complaining about is that the current end
>results are far from the nice image it projected. And that was caused by
>the Unicode guys dealing with the wrong Chinese delegates.

I'm not sure that dealing with the wrong people was the cause, but I don't
think we need to argue the point. What I think we agree on is that there
are some hundreds of characters that could and should be added to Unicode,
using the procedure in appendix D of the standard, that weren't in the
standards used. There are also some tens of thousands of characters which
won't fit into Unicode, but should be added as soon as possible to ISO
10646. And somebody needs to *implement* ISO 10646. Preferably a group of
universities, since scholars will be the ones to benefit most from it. They
could do an open implementation which would allow anyone to add new scripts
for testing purposes, and they could make source code publicly available.
Their respective presses should be willing to provide some funding, and
there are certainly foundations eager to support such an effort.

>In my understanding of the character coding issue, there is already an
>existing American National Standard, EACC -- ANSI Z39.64, used by
>libraries worldwide. Why, then, put so much effort into reinventing the
>wheel? The originator of EACC, CCCII, had already compiled 75,684
>ideographic characters (in 1989). Why couldn't we start from that and
>support that? Wasting precious resources is not one of the things a
>Buddhist should do.

"Characters from other standards have also been included, specifically,
from bibliographic standards used in libraries (ANSI Z39.47-1985 [Roman]
and ANSI Z39.64-1990 [East Asian]...The Unicode standard version 1.0 does
not encode rare, obsolete, idiosyncratic,...or private-use characters."
Unicode Standard version 1.0, p. 3.

Anyway, you gave the answer yourself: 75,684 characters, more than can fit
into Unicode. So it will go into ISO 10646, and I think everyone here
supports that effort.
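
The arithmetic behind that answer can be sketched in a few lines of Python. This is only an illustration, assuming a pure 16-bit code space of 65,536 positions for Unicode as it stood at the time; the names `CODE_SPACE_16BIT` and `overflow` are made up for the example, and the 75,684 figure is the one quoted above.

```python
# Assumption: a pure 16-bit code space, i.e. 2**16 code positions in total.
CODE_SPACE_16BIT = 2 ** 16

# Figure quoted in the message above for the CCCII repertoire (1989).
CCCII_IDEOGRAPHS = 75_684


def overflow(repertoire_size: int, code_space: int = CODE_SPACE_16BIT) -> int:
    """Return how many characters would not fit (0 if they all fit)."""
    return max(0, repertoire_size - code_space)


if __name__ == "__main__":
    print(f"16-bit code space: {CODE_SPACE_16BIT:,} positions")
    print(f"CCCII ideographs:  {CCCII_IDEOGRAPHS:,}")
    print(f"Left over:         {overflow(CCCII_IDEOGRAPHS):,}")
```

Even if every position were given to Han ideographs, which is not the case since other scripts and symbols share the space, the CCCII repertoire would still overflow it.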

>Timothy Huang



Edward Cherlin
Vice President, NewbieNet, Inc.
Helping Newbies to become "knowbies"
Point Top 5% of Web sites
Everything should be made as simple as possible, __but no simpler__. Albert Einstein
