Re: Unicode CJK Language Myth

From: Masayuki MORI (mori@ozawa.elec.keio.ac.jp)
Date: Tue May 14 1996 - 13:05:44 EDT


>To quote Lee Collins: " .................... Unless it is a mistake, there
>is no unification in Unicode that should cause a reader to think that one
>character is another or cause a reader to fail to identify a character if it
>appears in a different font."

Because their glyphs are quite different, I did not realize that
`choku' in "choku-setsu" (in Kanji) and `zhi' in "yi-zhi" (in
Simplified Hanzi) derive from the same character (U+76F4 in Traditional
Hanzi) until I saw the word "choku-setsu" rendered in a Chinese-first
Unicode font. A Japanese reader who doesn't know Chinese could fail to
identify the character if it is set in a Chinese font.

Of course Japanese users who don't know Chinese would use Japanese fonts,
and those who know Chinese could restore the correct glyphs in their heads.
But is that the right way? Can you imagine a character set that unifies
`LATIN SMALL LETTER M' and `CYRILLIC SMALL LETTER EM' because they can
be distinguished by context?
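
As a minimal sketch of that comparison, assuming a present-day Python and
its standard unicodedata module (neither is mentioned above), the following
prints how Unicode treats the three characters: the two look-alike letters
get separate code points, while the Han character has a single one shared
by all CJK fonts.

```python
import unicodedata

# Latin m and Cyrillic em look alike in many fonts,
# yet Unicode encodes them as two separate characters.
latin_m = "\u006D"
cyrillic_em = "\u043C"

# The unified Han character discussed above: one code point,
# whatever glyph a Japanese or Chinese font draws for it.
han_zhi = "\u76F4"

for ch in (latin_m, cyrillic_em, han_zhi):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Which glyph appears for U+76F4 is then decided by the font, not by the
code point.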

Well, Han unification itself is not bad. CJK share most Han characters.
The glyphs that are not interchangeable, and therefore should be encoded
separately, appeared after the glyph reforms in Japan and China (PRC)
in the 1940s and 1950s. For Japan, the number of characters whose glyphs
were changed is around 800, so adding them to Unicode does not seem
difficult. For China, the number will be larger because the reform also
changed radicals, which altered even the rarely used characters that
contain them. Frequently used characters should be separated as a
start. This is not a theoretical argument, but it is more practical
than "the rule unifies them."

Finally, I don't feel qualified to discuss Han issues -- I haven't
checked the characters in Unicode against those in current standards
one by one, and my Chinese language level is 'hello, world.'
I posted this only because I probably know better than she does:

>Anecdote 1. "Yes, yes. I once took our NeXT Japanese product home, on my home
>machine, and showed it to my wife. I put up a hunk of JIS code chart on the
>screen preliminary to showing her how the Japanese input system works. The
>first words out of her mouth were: "Naaaaani sore? Chukokugo?" which, roughly
>translated, means: "What's *that*? Chinese?"
>
>I explained that it's the Japanese national standard (rendered in a very
>finely designed Morisawa Mincho font)... and she said she's never seen MOST of
>those characters..."

Japanese sentences are never written in Kanji alone, so to her a run of
nothing but Kanji could well look like a Chinese sentence.

--
Masayuki MORI
mymori@dvpj.sony.co.jp since this April


