Re: Unicode 3.0 press statements

From: Paul Keinanen
Date: Thu Jan 20 2000 - 04:50:11 EST

On Wed, 19 Jan 2000 23:29:30 -0800 (PST), Edward Cherlin wrote:

>At 11:41 -0800 2000/01/19, wrote:
>>We will be issuing a press release for Unicode 3.0, and I am working with a
>>press agent now. One thing that she was discussing is that it is very
>>useful to include numbers in the statements; it catches journalists'
>>attention. What she would like to see are statements like:
>>Covers 95% of all world languages
>>Covers 99% of all languages used in commerce
>>I thought it would be useful to query this group for sample statements
>>along these lines, statements that would both:
>>a) catch people's attention
>>b) be true!

>Don't talk about % of languages. Talk about people.

This does not solve the problem.

There are a lot of truly bilingual or trilingual people in the world.
If one language is supported by Unicode and the other is not, into
which category do you put these people?

>One of the popular almanacs used to have a single page list of the
>languages with the most speakers, and may still do so. I believe that
>Unicode now covers all of the scripts listed, which may well cover
>the primary language of 99% of the world population,

If you are referring to the Time Almanac, it lists the 50 most common
languages, the last five of which have about 20 million speakers each,
and the languages ranked 51 to 70 most likely have more than 10
million speakers each. Since 1 % of the world population is about 60
million, it is clear that the 50 listed languages cover much less
than 99 % of the world population.
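
The arithmetic behind this can be checked with a short Python sketch.
It uses only the rough figures given above (a world population of
about 6000 million, and 20 languages at ranks 51 to 70 with at least
10 million speakers each, which is itself an estimate). The function
and constant names are made up for illustration, and the bound ignores
both bilingual speakers and every language below rank 70.

```python
# Rough figures from the message above, all in millions of speakers.
WORLD_POPULATION = 6000          # implied by "1 % is about 60 million"
RANKS_51_TO_70 = 70 - 51 + 1     # 20 languages
MIN_SPEAKERS_EACH = 10           # "more than 10 million" each (an estimate)


def max_top50_coverage(world, n_languages, min_speakers):
    """Upper bound (in percent) on first-language speakers of the top 50.

    Everyone counted under ranks 51-70 is, by definition, outside the top 50.
    """
    outside = n_languages * min_speakers
    return 100.0 * (world - outside) / world


bound = max_top50_coverage(WORLD_POPULATION, RANKS_51_TO_70, MIN_SPEAKERS_EACH)
print(f"The top 50 cover at most {bound:.1f} % of the world population")
```

With these assumptions, ranks 51 to 70 alone account for at least 200
million people, more than three times the 60 million that 1 % would
allow.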

The concept of primary language is also a bit problematic in bilingual
communities in which both languages are used daily and in which two
people might use both languages to talk to each other during the day.
The concept of first language (as used in the list referenced above) is
a bit more accurate, but in bilingual communities a child will learn
both languages in parallel. The language registered for a child in a
population registry or given at a census may vary quite a lot for
political (e.g. fear of oppression), economic or ideological (e.g.
nationalistic) reasons, so it does not give an accurate count for
a specific language.

If you want a percentage that has to be true, then I don't think there
is any other way than to go through all the hundreds or thousands of
languages in the world, check whether each is supported by Unicode and,
if not, add up the number of speakers (a figure that is often quite
unreliable) and compare the sum to the total world population.
Unfortunately, counting it this way will only give a minimum for
the Unicode support, and it might be much less than 99 %.
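
The counting procedure described above could be sketched in Python
roughly as follows. The language names and speaker counts are
placeholders, not real data, and the Language record and function name
are invented for this illustration. The result is only a lower bound,
presumably because speakers of an unsupported language who also use a
supported one are still counted as uncovered.

```python
from typing import NamedTuple


class Language(NamedTuple):
    name: str
    speakers_millions: float   # often quite unreliable in practice
    supported: bool            # is the language's script in Unicode?


def coverage_lower_bound(languages, world_population_millions):
    """Percent of the world population not counted under an unsupported language."""
    unsupported = sum(
        lang.speakers_millions for lang in languages if not lang.supported
    )
    return 100.0 * (1.0 - unsupported / world_population_millions)


# Placeholder entries only; a real run would need the full list of languages.
SAMPLE = [
    Language("Language A", 900.0, True),
    Language("Language B", 45.0, False),
    Language("Language C", 15.0, False),
]

result = coverage_lower_bound(SAMPLE, 6000.0)
print(f"At least {result:.1f} % covered")
```

Only the unsupported languages contribute to the sum, so supported
languages could be left out of the list without changing the result.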

Anyway, people are so used to xxx.99 prices in advertisements that
general statements like "more than 99 %" are not very credible these
days. Thus, if a calculation gives 99.2 % with an error margin of less
than 0.1 %, I guess it would be more credible to report it as "99.2 %"
than as "more than 99 %" :-).

> and certainly
>covers at least one language of 100% of people engaged in
>international trade, treaty organizations, finance, academic
>research, science, technology, publishing, and some other activities.

No doubt about that.

>It may be useful to point out that more than half of the world's
>population uses writing systems other than Latin alphabet, with China
>and India making up nearly a third of the total world population.

This is a good point.

Paul Keinänen

