RE: Multilingual Documents

From: Hohberger, Clive (
Date: Wed Nov 24 1999 - 14:24:59 EST

> -----Original Message-----
> From: []
> Sent: Wednesday, November 24, 1999 12:37 PM
> To: Unicode List
> Subject: RE: Multilingual Documents [was: HTML forms and UTF-8]
        [Hohberger, Clive P.] <snip>

> The fact is that "multilingual documents" have never been a problem, as far
> as all the involved languages share the same character set. The real problem
> is with *multi-script documents*, and I guess that this shrinks the ratio
> even more.
        [Hohberger, Clive P.] <snip>

        A classic "MULTI-SCRIPTING" problem arises with Japanese documents,
        particularly technical papers. As I'm sure everyone knows, aside from the
        Chinese characters (Kan-ji), Japanese writing also uses the phonetic
        syllabaries Hiragana for Japanese language words and Katakana for
        loan words (words of non-Japanese origin). In addition, there are often
        Latin and Greek characters used in equations and for technical terms.
        Often phrases of English are embedded using Latin characters
        in the middle of Japanese text.

        The Japanese Industrial Standard JIS-X-208 (last revised
        to accommodate this by including the basic Latin, Greek and Cyrillic
        alphabets within the total character set. This (and the Shift-JIS
        transformation) work fine as long as there are, for example, no
        variants such as accented Latin characters. But trying to embed a
        French, Swedish, Czech or Vietnamese phrase is a nightmare in JIS
        208 and Shift-JIS writing systems... when it can be done at all.
        Basically it's like me trying to embed Kanji in an English document:
        I usually do it by converting it to a graphic and embedding the graphic.
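
        To make the limitation concrete, here is a minimal sketch that checks
        which kinds of text survive a round trip into Shift-JIS. It assumes
        Python's built-in "shift_jis" codec is a fair stand-in for the
        JIS 208 repertoire discussed above; the helper name and the sample
        strings are purely illustrative and not taken from any standard.

```python
def encodable(text, encoding):
    """Return True if every character in text can be encoded in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative samples only; any similar strings would do.
samples = {
    "Japanese (Kanji, Hiragana, Katakana)": "漢字ひらがなカタカナ",
    "Basic Latin": "Unicode",
    "Greek": "αβγ",
    "Cyrillic": "Привет",
    "French (accented Latin)": "café",
    "Swedish (accented Latin)": "smörgåsbord",
    "Czech (accented Latin)": "Dvořák",
    "Vietnamese (accented Latin)": "Việt",
}

for label, text in samples.items():
    sjis = "ok" if encodable(text, "shift_jis") else "FAILS"
    utf8 = "ok" if encodable(text, "utf-8") else "FAILS"
    print(f"{label:38} shift_jis: {sjis:6} utf-8: {utf8}")
```

        Under that assumption, the basic Latin, Greek and Cyrillic samples
        encode alongside the Japanese text, while the accented samples are
        rejected by the Shift-JIS codec and accepted by UTF-8.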

        One of the challenges will be to take full advantage of Unicode
        in multiscripting in Japanese word processing systems.

