RE: Multilingual Documents

From: Hohberger, Clive (CHohberger@zebra.com)
Date: Wed Nov 24 1999 - 14:24:59 EST


> -----Original Message-----
> From: Marco.Cimarosti@icl.com [SMTP:Marco.Cimarosti@icl.com]
> Sent: Wednesday, November 24, 1999 12:37 PM
> To: Unicode List
> Subject: RE: Multilingual Documents [was: HTML forms and UTF-8]
        [Hohberger, Clive P.] <snip>

> The fact is that "multilingual documents" have never been a problem, as far
> as all the involved languages share the same character set. The real problem
> is with *multi-script documents*, and I guess that this shrinks the ratio
> even more.
>
        [Hohberger, Clive P.] <snip>

        A classic "MULTI-SCRIPTING" problem arises with Japanese documents,
        particularly technical papers. As I'm sure everyone knows, aside from
        the Chinese characters (Kanji), Japanese writing also uses the phonetic
        syllabaries Hiragana for Japanese language words and Katakana for
        foreign words (words of non-Japanese origin). In addition, Latin and
        Greek characters are often used in equations and for technical
        terminology. English phrases are also often embedded in Latin
        characters (Romaji) in the middle of Japanese text.

        The Japanese Industrial Standard JIS X 0208 (last revised 1997) attempts
        to accommodate this by including the basic Latin, Greek and Cyrillic
        alphabets within the total character set. This (and the Shift-JIS
        pseudo-transformation) works fine as long as there are, for example, no
        accented Latin characters. But trying to embed a French, Swedish, Czech
        or Vietnamese phrase is a nightmare in JIS X 0208 and Shift-JIS
        systems... when it can be done at all. Basically it's like me trying to
        embed Kanji in an English document... I usually do it by converting it
        to a graphic and embedding the graphic.
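
        The limitation is easy to reproduce. The following is a minimal
        Python sketch, assuming that Python's built-in "shift_jis" codec is a
        fair stand-in for a Shift-JIS based system; actual word processors
        may behave differently. The helper can_encode and the sample strings
        are made up for illustration.

```python
def can_encode(text, encoding):
    """Return True if every character in text has a code in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample phrases, one per language or script.
samples = {
    "Japanese": "日本語のテキスト",
    "Basic Latin": "Unicode",
    "Greek": "αβγ",
    "Cyrillic": "Русский",
    "French": "résumé",
    "Swedish": "smörgåsbord",
    "Czech": "čeština",
    "Vietnamese": "tiếng Việt",
}

for label, text in samples.items():
    sjis = can_encode(text, "shift_jis")
    utf8 = can_encode(text, "utf-8")
    print(f"{label:12} shift_jis={sjis!s:5} utf-8={utf8}")
```

        Under that assumption, the Japanese, basic Latin, Greek and Cyrillic
        samples encode in Shift-JIS, while the accented French, Swedish, Czech
        and Vietnamese samples do not. All of them encode in UTF-8.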

        One of the challenges will be to take full advantage of Unicode's
        multi-script capabilities in Japanese word processing systems.

        Clive


