Re: Japan loves UNICODE

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Thu Apr 27 2000 - 09:55:12 EDT


"mary ink" wrote on 2000-04-27 11:22 UTC:
> >From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
> >For some obscure historical reasons and misunderstandings, "Unicode" is
> >a taboo word in Japan. Simply replace in everything you say to Japanese
> >people the word "Unicode" by "JIS X 0221", and your problem is solved.
> >The two are technically the same, but the latter is a Japanese National
> >Standard and therefore it is the only politically correct way of talking
> >to Japanese geeks about character sets.
>
> Huffily dismissing criticism that has "already been hashed out
> here many times" or calling objectors "Japanese geeks" doesn't really
> present a friendly face to the end user who may have legitimate cultural
> objections to how the code has been created and implemented, as well as
> technical concerns that need to be addressed.

"Geek" is just the precise word that Japanese friends have used when
they described and characterized to me the typical "Senior Japanese
Information Technology Expert" that still pronounces to the world how
horribly unacceptable Unicode is for Japanese users. These few but vocal
anti-Unicode protagonists represent merely a vanishingly small minority
that argues about highly academic and obscure special cases without
being able to provide specific realistic examples or even demonstrate
familiarity with ISO 10646.

If there are "legitimate cultural objections", I'd be happy to see and
discuss them using specific examples, but as long as Unicode is a
superset of JIS X 0208 and JIS X 0212 as well as an implementation of
JIS X 0221, the criticism is of the sort that hits Japan's national
legacy encodings just as well.
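
To make the superset point concrete, here is a minimal sketch in Python.
It assumes the codec names "euc_jp", "shift_jis" and "iso2022_jp" from
Python's standard library; the helper name survives_unicode() and the
sample string are made up for illustration. It only checks one sample
string, so it illustrates the round trip rather than proving the
superset property for the whole repertoire.

```python
def survives_unicode(legacy_bytes: bytes, codec: str) -> bool:
    """Return True if legacy-encoded bytes come back unchanged after
    being decoded to Unicode and re-encoded with the same codec."""
    as_unicode = legacy_bytes.decode(codec)          # legacy -> Unicode
    return as_unicode.encode(codec) == legacy_bytes  # Unicode -> legacy


# Sample text using only characters from JIS X 0208 (illustrative choice).
sample = "日本語の文字コード"

for codec in ("euc_jp", "shift_jis", "iso2022_jp"):
    legacy = sample.encode(codec)
    print(codec, survives_unicode(legacy, codec))
```

If a character were lost or merged on the way through Unicode, the
comparison would fail for a string containing it, and that string would
be exactly the kind of specific example I am asking for.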

A vast amount of Japanese information processing is already done in
Unicode anyway today, and if you add an ISO 2022 syntax to the HL7
standards, I guarantee you that almost all end-system implementations
will internally convert this back to Unicode anyway. The only thing that
adding non-Unicode character sets to protocol specifications gives you
is the requirement to add huge conversion tables to the implementations.
No real problems are solved this way, because the standard operating
systems are all moving very quickly to JIS X 0221 = ISO 10646 as their
native character encoding.
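
As a rough sketch of what such an end system ends up doing, again in
Python: the function name to_internal() and the idea of a single "wire"
field are assumptions for illustration and are not taken from HL7. The
"iso2022_jp" codec from the standard library stands in for the
conversion tables.

```python
def to_internal(wire_bytes: bytes, charset: str = "iso2022_jp") -> str:
    """Convert a field received in an ISO 2022 based encoding into the
    Unicode string the application works with internally."""
    return wire_bytes.decode(charset)


# Build an example field as it would appear on the wire (hypothetical data).
wire = "日本".encode("iso2022_jp")

# ISO-2022-JP switches character sets with escape sequences:
# ESC $ B selects JIS X 0208, ESC ( B switches back to ASCII.
print(wire)

internal = to_internal(wire)
print(internal == "日本")
```

Everything after to_internal() works on Unicode strings; the ISO 2022
syntax exists only at the protocol boundary, which is where the
conversion tables have to live.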

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


