Re: Unicode CJK Language Myth

From: Edward Cherlin (cherlin@snowcrest.net)
Date: Mon May 27 1996 - 04:10:28 EDT


handa@etl.go.jp (Kenichi Handa) wrote:
>I'm not claiming to distinguish ALL the different ways, but claiming
>that Unicode should have distinguished more reasonable variants.
>Although the word "reasonable" is very vague, the current unification
>of Unicode appears unreasonable to many Japanese. If we don't have to
>worry about the possibility of, for instance, character "choku" being
>shown in an unexpected way, more and more Japanese people accept
>Unicode.
[rest of discussion snipped]

Very well. Under what circumstances would a Japanese user see the character
"choku" rendered incorrectly when using Unicode?

A Japanese person using Unicode-enabled Japanese-language software with
Unicode-encoded Japanese fonts will continue to see the expected rendering.

Anybody using Chinese and Japanese fonts together might see either the
Chinese or Japanese rendering. Presumably someone who uses Chinese fonts
has a specific reason for doing so--perhaps the person is Chinese, or is a
scholar of Chinese, or is doing business in China--but whatever the reason,
that person has to accept that Japanese and Chinese fonts are different,
and learn the significant differences.

So who is left, who could have a valid objection to unifying the glyphs
into one character?

It is an important principle of Unicode that glyphs distinguished in any
national character set must be assigned distinct characters in Unicode.
Since no Japanese or Chinese character set standard distinguishes these two
glyphs, the conclusion is that nobody feels the need to make the
distinction at the character level. The differences occur only when using
two distinct character codes and two distinct fonts, as may happen in a
bilingual dictionary or language textbook.

How do the objectors to Unicode handle the problem of rendering "choku"
now? Either it doesn't present a real problem, or they use separate
Japanese and Chinese fonts with incompatible codings. Is this an advantage
to anyone?
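
The incompatibility of the existing codings can be illustrated with a short
Python sketch. It assumes that "choku" is the character at U+76F4 (the post
does not give a code point) and uses the codec names shipped with CPython;
the variable and function names are made up for this illustration.

```python
# Assumption: "choku" is U+76F4; the post itself names no code point.
CHOKU = "\u76f4"

# Legacy double-byte codings for Japanese and Chinese (CPython codec names).
LEGACY_CODINGS = ["shift_jis", "euc_jp", "gb2312", "big5"]


def legacy_bytes(ch):
    """Return the byte sequence for ch under each legacy coding."""
    return {name: ch.encode(name) for name in LEGACY_CODINGS}


encoded = legacy_bytes(CHOKU)

# Decoding each byte sequence with its own codec recovers the same character.
decoded = {name: raw.decode(name) for name, raw in encoded.items()}

for name, raw in encoded.items():
    print(f"{name:10} bytes={raw.hex()}  ->  U+{ord(decoded[name]):04X}")
```

Each coding stores the character under its own byte sequence, so text cannot
be exchanged without knowing which coding was used, while the decoded
character is the same single code point in every case.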

Have I missed something? Is this style of explanation satisfactory to
Japanese computer users?

On this point, I find it far more confusing when similar glyphs are not
unified into one character, as in the case of the ASCII angle brackets '<>'
and the mathematical angle brackets, which are taller and have a wider angle.
However, I consider the benefits of a single character set to be far more
important than such a minor inconvenience. We could (and we will) argue
about the merits of a number of Unicode character definitions, but that
should not block us from using Unicode.
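
For readers who want to see the two kinds of bracket side by side, here is a
small Python sketch. It assumes that the mathematical bracket meant above is
U+2329 LEFT-POINTING ANGLE BRACKET; the post does not name a code point, so
that choice is only a guess for the sake of illustration.

```python
import unicodedata

ASCII_LESS_THAN = "<"          # U+003C, the ASCII character
ANGLE_BRACKET_LEFT = "\u2329"  # assumed stand-in for the mathematical bracket

for ch in (ASCII_LESS_THAN, ANGLE_BRACKET_LEFT):
    # Print the code point and the standard character name.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The glyphs look alike, but they are separate characters, so a comparison
# or a search for one will not match the other.
brackets_are_distinct = ASCII_LESS_THAN != ANGLE_BRACKET_LEFT
```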

Unicode is essential for global software development, and significant even
for monolingual products and applications where good typesetting or math
are of any importance. ASCII and the various 8-bit extensions to ASCII are
all entirely inadequate for English or any other Latin alphabet language.
Double-byte encodings of CJKV languages are worse, from a technical point
of view. They are difficult to process correctly on a computer, are
incompatible with each other, and do not offer enough character code points
for scholarly applications. They do nothing to help developers or users to
display "choku" correctly.
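
One well-known example of the processing difficulty, sketched in Python: in
Shift_JIS the second byte of some double-byte characters has the same value
as the ASCII backslash, so a byte-by-byte search gives a false match. The
character U+8868 is chosen here only as a familiar instance of the problem;
it is not mentioned in the discussion above.

```python
BACKSLASH = "\\"
SAMPLE = "\u8868"  # a common kanji whose Shift_JIS trail byte is 0x5C

sjis = SAMPLE.encode("shift_jis")
print(sjis.hex())  # the two bytes of the Shift_JIS form

# Naive byte-level search: reports a backslash that is not really there,
# because the trail byte collides with ASCII 0x5C.
false_match = BACKSLASH.encode("ascii") in sjis

# Character-level search on decoded text: no such collision is possible.
character_match = BACKSLASH in sjis.decode("shift_jis")

print(false_match, character_match)
```

Software that handles such text as a stream of bytes has to track lead and
trail bytes everywhere; software that works on whole characters does not.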

Edward Cherlin, Vice President, NewbieNet, Inc.
(916) 938-4684   http://www.newbie.net/
Helping Newbies to become "knowbies"
Point Top 5% of Web sites
Everything should be made as simple as possible, __but no simpler__.
Albert Einstein


