From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Wed Dec 01 1999 - 01:31:38 EST

Microsoft and Xerox market research found a small percentage of
documents that require more than one language, and even fewer that
require more than one writing system:

At 03:00 -0800 1999/11/22, Chris Pratley wrote:
> The multilingual documents we found while doing our customer
>visits were mainly government related, or created by people whose business
>was multilingual documents (e.g. translators, linguists). It is not the case
>that these documents are hard to create (after all, how hard is a bilingual
>French/English or German/English document to create, technically?). It is
>just not preferred. Most people create two versions of a document if they
>genuinely need multilingual versions of one document. Otherwise, they simply
>pick a language appropriate to the audience and use that exclusively.

On Mon, 22 Nov 1999 11:15:32 -0800 (PST), "Becker, Joseph"
<Joseph.Becker@pahv.xerox.com> wrote:

>What Chris says matches the results of market studies we (Xerox) did on
>multilingual systems. It is globalization and connectivity that create the
>value of one-world architecture; multilingual documents are a pleasant bonus
>for those of us who need or enjoy them.

Like me. I produce and consume documents containing various
combinations of math, APL, music, English, French, German, Russian,
Greek, Hebrew, Chinese, Japanese, and Korean.

Without question, the proportion of multilingual documents is small.
However, I believe that the average importance of multilingual
documents is much higher than that of documents in general. In my market
research report on "The Global Impact of the Unicode Character Set
Standard" (published in 1994) I took a different approach. Rather
than sift through the multitudes of documents looking for the few
multilingual ones, I looked at situations where creation and
dissemination of multilingual documents would be natural if it were
possible, but where people had previously gotten used to doing
without, or were using tools unsuited to wide electronic document
interchange, such as handwriting and any of the hundreds of mutually
incompatible editors and word processors using mutually incompatible
character sets.

My question was not "What fraction of documents are multilingual or
multiscript?" but rather "In what situations would multilingual or
multiscript documents become the norm, if the tools were universally
available both to create and to view them?" There are millions of
people with such requirements or possibilities. They include

Multilingual governments, such as Québec, Belgium, Switzerland,
Afghanistan, the Russian Federation, and much of Africa.

Multiscript nations, such as India (10 official scripts), the Chinese
language region (mainly China/[Hong Kong]/Macao/Singapore/Taiwan)
(Big 5, GB, Pinyin, Zhuyin, and more), and Cyprus (Latin, Greek).

Multiscript international organizations, such as the UN and many of
its agencies (Latin, Cyrillic, Chinese, and sometimes others), the EU
(variants of Latin, plus Greek), and various economic and military
treaty organizations.

Multiscript languages such as Serbo-Croatian (Latin and Cyrillic),
Hindi/Urdu (Devanagari and Arabic), several central Asian languages
(Arabic, Cyrillic, and Latin), and Mongolian (Cyrillic, Mongol). Both
Swahili and Turkish changed over in this century from Arabic to Latin.

Religious communities, including Judaism (Hebrew, Yiddish, Ladino),
Christianity (Greek, Latin, Hebrew and sometimes Aramaic for sources,
plus Coptic, Syriac, Georgian, Armenian, Cyrillic, and modern
translations into a multitude of languages), Hinduism (all of the
Indic scripts and some others), Buddhism (Pali in Sinhalese,
Devanagari, and Thai scripts plus Sanskrit, Tibetan, Mongolian,
Chinese, Japanese, Korean).

Travellers (signage, ATMs, in-flight publications, and more)

Scholars and their students in history, literature, art,
international law, and other subjects.

Mathematicians and computer scientists

APL programmers

Teachers and students of languages

Librarians and users of library catalogs

International sports (especially soccer) and games (chess, go, bridge)

Multilingual search engines

Now, in the current fractured software market, it is impossible to
extrapolate from current usage of languages and scripts to a
situation in which readily available fonts and software make it
easy to create and exchange multilingual documents across all
platforms and on the Internet.

We can't even see, in the current state of the market, how we will
ever create such software, even if we believe firmly that we will
someday. We have products such as

Microsoft Office 2000 (lots of functions, fair to middling Unicode
support, Windows now, Mac whenever)

Adobe FrameMaker (Windows, Mac, and Unix, rather limited script support)

Unitype Global Writer and Global Office (good Unicode support, Windows only)

Alis Tango Browser (good Unicode support, Windows only)

Users will want document creation tools that can update themselves
when additions to Unicode are published, by reading in the property
files, and that run identically on Windows, Mac, Unix, palmtops,
and whatever else comes to hand; font management tools that make it
possible to join fonts covering different character ranges together,
or to assign larger fonts to more than one script; editors for
customizing keyboard mappings and IMEs; dictionaries; spelling and
grammar checkers; and a whole lot more.
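
To make the self-updating idea concrete, here is a rough sketch of a
tool picking up character properties from data rather than from
compiled-in tables. It assumes the property data uses the
semicolon-separated layout of UnicodeData.txt (code point, name,
general category, combining class, bidi class, and so on). The helper
name load_properties and the three sample records are made up for
illustration; a real tool would read the published file instead.

```python
# Sketch only: parse records laid out like UnicodeData.txt.
# SAMPLE stands in for a real property file (assumption for this example).
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
05D0;HEBREW LETTER ALEF;Lo;0;R;;;;;N;;;;;
0416;CYRILLIC CAPITAL LETTER ZHE;Lu;0;L;;;;;N;;;;0436;
"""


def load_properties(text):
    """Hypothetical helper: map code point -> a few of its properties."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0 is the hex code point
        table[code_point] = {
            "name": fields[1],      # character name
            "category": fields[2],  # general category, e.g. Lu, Lo
            "bidi": fields[4],      # bidirectional class, e.g. L, R
        }
    return table


properties = load_properties(SAMPLE)

# Look up each character of a mixed-script string in the loaded table.
for ch in "A\u05d0\u0416":
    info = properties[ord(ch)]
    print(f"U+{ord(ch):04X} {info['name']} "
          f"({info['category']}, bidi {info['bidi']})")
```

The point of the sketch is only that nothing about any particular
script is built into the program: new records in the file become new
entries in the table without a new release of the tool.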

In early 1982 I was on a panel at a computer conference in San
Francisco where the question came up of what people would do to
fill up the whole 640K DOS address space. After all, the proportion
of programs that needed even 256K was extremely small. I suggested
graphics, video, music, and databases. Now I will suggest that people
who will never learn Chinese would still like it to display correctly
in their browsers and elsewhere. I will also suggest that ready
access to materials in other languages on the Internet will result in
revolutions in language teaching and learning.

It seems to me that we suffer from the same myopia about writing
systems today that we did about memory and disk capacity back then.
We don't see a lot of multilingual usage today, when it is highly
impractical in general. (I have been creating multiscript documents
for two decades, some for publication and others for my own use. I
continue to test the tools, none of which are up to meeting my
requirements yet.) Lack of multilingual documents created with
today's tools doesn't tell us what we will be doing in five or ten
years. It does tell us that if we don't implement, we won't ever get
to see users.

Edward Cherlin
"A knot! Oh, do let me help undo it."
Alice in Wonderland

