Re: Multilingual Documents [was: HTML forms and UTF-8]

From: Paul Keinanen (
Date: Tue Nov 23 1999 - 03:45:32 EST

On Mon, 22 Nov 1999 11:15:32 -0800 (PST), "Becker, Joseph"
<> wrote:

>What Chris says matches the results of market studies we (Xerox) did on
>multilingual systems. It is globalization and connectivity that create the
>value of one-world architecture; multilingual documents are a pleasant bonus
>for those of us who need or enjoy them.

Did you also look at countries that are officially bi- or
multilingual, such as Belgium, Finland, Switzerland etc.?

In Finland, Finnish and Swedish are both official languages, and in
order to get a public office you have to pass a language test in the
language that is not your mother tongue. (In addition,
Sami has an official status in the northernmost parts of the country.)

Previously, bilingual material from various state and municipal
organisations and private companies was quite common, with either the
same text printed in both languages on the same page, or one side of
the paper in one language and the other side in the other language.
However, with computers, the language of an individual is usually
registered, so it should be easy to send the correct monolingual
version to each individual, thus saving a large amount of paper.
There are sometimes errors in these mailings, though, which usually
provoke a large number of letters to the editor in the local
newspaper :-).

In situations in which the language of an individual cannot be known,
such as in advertising, it is common to use both languages in
districts with a mixed bilingual population. Price tags are often
bilingual, with the numbers in the middle. Product packages and their
user instructions are bilingual as well.

When bilingual (or trilingual with English) HTML pages are used, they
are usually divided into separate directory trees, so the user only
has to select the language once.
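As a rough illustration of the separate-tree layout, here is a minimal
Python sketch. The tree names (fi, sv, en), the page paths and both
function names are my own assumptions, not taken from any particular
site. Once a page lives under one tree, every relative link stays in
that language, and switching language only means swapping the first
path component.

```python
import posixpath

# Hypothetical tree names: one top-level directory per language.
LANGUAGES = ("fi", "sv", "en")


def localized_path(language, page):
    """Return the URL path of a page inside one language tree."""
    if language not in LANGUAGES:
        raise ValueError("unknown language tree: %r" % language)
    return posixpath.join("/", language, page.lstrip("/"))


def switch_language(path, new_language):
    """Map a path in one language tree to the same page in another."""
    parts = path.lstrip("/").split("/", 1)
    if parts[0] not in LANGUAGES:
        raise ValueError("path is not inside a language tree: %r" % path)
    rest = parts[1] if len(parts) > 1 else ""
    return localized_path(new_language, rest)
```

For example, switch_language("/fi/products/index.html", "sv") gives
the matching page in the Swedish tree.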

Nowadays bilingual documents are avoided if the preferred language of
the recipient is known, but this is not always the case, so bilingual
documents are still needed.

From a character encoding point of view, Finnish and Swedish are quite
easy: their encodings were nearly identical even in the 7-bit era, and
the introduction of ISO-8859-15 (Latin 9) has reduced the remaining
problems, also solving some problems with Sami.
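A small Python sketch of the 8-bit situation, using the standard
codec names iso8859_1 and iso8859_15. The sample strings and the
helper name encodable are my own choices for illustration. The
letters shared by Finnish and Swedish get the same bytes in Latin 1
and Latin 9, while s and z with caron only fit in Latin 9. A letter
such as c with caron, used in Northern Sami, is in neither, which
would be consistent with only some of the problems being solved.

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Letters common to Finnish and Swedish: identical bytes in both sets.
shared = "åäöÅÄÖ"
same_bytes = shared.encode("iso8859_1") == shared.encode("iso8859_15")

# Letters with caron: present in Latin 9, absent from Latin 1.
caron_letters = "šžŠŽ"
in_latin1 = encodable(caron_letters, "iso8859_1")
in_latin9 = encodable(caron_letters, "iso8859_15")

# A Northern Sami letter that neither 8-bit set contains.
sami_c_in_latin9 = encodable("č", "iso8859_15")
```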

The situation is going to be interesting in the Baltic states, which
have a very large Russian-speaking minority. Russian does not
currently have any official status (for understandable historical
reasons) in these countries, but my guess is that there is going to be
a lot of advertising and other private-sector activity in Russian in
areas with a mixed population, although it may take quite a long time
until official documents are also available in Russian, if ever. So my
guess is that a lot of software capable of handling both Latin and
Cyrillic text will be needed in these countries.

In some former Soviet areas there seem to be efforts to convert (back)
from Cyrillic to Latin script when writing the local language. While
this is not a bilingual situation, document processing capable of
handling both Cyrillic and Latin scripts will be needed.
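As a hedged sketch of the minimum such software has to do, the Python
below reports which scripts occur in a piece of text by looking at
Unicode character names from the standard unicodedata module. The
function names and the sample strings are my own; real document
processing would of course also need fonts, sorting, keyboard input
and so on.

```python
import unicodedata


def script_of(ch):
    """Classify one character as Latin, Cyrillic or Other (None if not a letter)."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "Latin"
    if name.startswith("CYRILLIC"):
        return "Cyrillic"
    return "Other"


def scripts_used(text):
    """Return the set of scripts that appear among the letters of text."""
    return {s for s in map(script_of, text) if s is not None}
```

Calling scripts_used("Rīga Рига") reports both Latin and Cyrillic,
which is the mixed case described above.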
