RE: [OT]non-terrestrial writing systems

From: Philippe Verdy
Date: Thu May 31 2007 - 19:35:29 CDT

If I just consider the current data found in the CLDR language-territory
map, we currently have a total of 443 languages spoken by 5,972,581,880

So there are still lots of people speaking modern languages that are not
covered (rough estimate: about 1.5 billion), possibly more, because those
CLDR estimates are only for the primary language:

If I look, for example, at France, the CLDR data says that French is spoken
by only 51 million people out of more than 71 million residents, and an
incredibly large 16 million people (but most probably only as a secondary
language; and it forgets more common primary languages spoken in France by
French natives: Arabic, Berber, Rom/Tzigane, Armenian, and lots of other
languages spoken by more recent immigrants with legal residence: the same
languages as well as Romanian, Polish, Chinese, Vietnamese, Persian,
Turkish, and many African languages...). (Note that the CLDR data includes
statistics for migrants, but understates the statistics for French-speaking US

This confirms that the CLDR data concentrates on the primary language or,
for languages spoken by communities that are spread in very small minorities
over a territory and are not directly identifiable, on an official lingua
franca. But the same data contains statistics for old regional
languages, even though most of them are only spoken as a secondary language.
The case of English in France is very significant.

I'm sure that those statistics are tweaked in favour of more important
languages, but even so, they are missing lots of people in the
world. Notably, data is missing for languages in:
* [MX] Mexico, [BO] Bolivia and [PE] Peru: lots of Amerindian languages
* [MQ] Martinique (France): French is given very low statistics, probably
French Creole is missing (but in Guadeloupe, the statistics indicate
standard French spoken by everyone, without any creole?)
* [DZ] Algeria and [TN] Tunisia: missing Berber, Fulah (Peul)...
* [CG] Congo-Brazzaville, [CM] Cameroon: missing lots of African languages
* [CI] Côte d'Ivoire: missing lots of African languages, or statistics are
most probably about 100 times too low if considering only the lingua-franca
languages (only French and Koro?), possibly an input bug! Missing English
* [GM] Gambia: lots of African languages
* [ZA] South Africa: only the official languages are listed, plus Swati,
Swahili, South Ndebele, and Hindi, the only non-African language listed
(where is Chinese?)
* [RE] Reunion (France): Reunion French Creole is listed along with Tamil,
but Chinese is missing
* [SC] Seychelles: where are Indian languages?
* [JE] Jersey and [GG] Guernsey: where are English, Normand, Jersiais and
* [GI] Gibraltar: most probably, Spanish is missing there.
* [RU] Russia: many Asian languages (including Chinese and Mongolian) and
German, Yiddish, Hebrew...
* [CN] China (People's Rep.), [MO] Macau SAR, [HK] Hong Kong SAR: missing
Southern Chinese dialects, plus Hmong and Turkic languages.
* [MY] Malaysia: lots of native languages
* [PH] Philippines, [TH] Thailand: their native languages are spread all
around the world through navigation
* [ID] Indonesia: certainly lots of native languages missing
* [CK] Cook Islands: missing Cook Islands Maori (only English listed)
* [NC] New-Caledonia (France): missing native Polynesian languages
* [WF] Wallis-and-Futuna (France): missing native Polynesian languages

Most missing languages are in South-East Asian archipelagos, India, China,
all over Africa, Central America, and North-West of South America. Only
European languages and large Asian languages are "well" covered at least
with the primary language plus some regional languages.
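The kind of gap described above can be checked mechanically. What follows is only a rough sketch in Python, not the way CLDR itself is processed: the record layout, the function names, the 0.5 threshold and the territory "XX" are all assumptions made for illustration. Only the France figures (51 and 16 million speakers out of 71 million residents) come from the numbers quoted earlier in this message.

```python
# Sketch: flag territories where even the largest listed language covers
# only a small share of the population. All names here are hypothetical.

# (territory code, language label, listed speakers)
ENTRIES = [
    ("FR", "French", 51_000_000),                 # figure quoted in this message
    ("FR", "second listed language", 16_000_000),  # figure quoted in this message
    ("XX", "lingua franca", 150_000),              # invented territory and value
]

# Resident population per territory ("XX" is invented for illustration).
POPULATIONS = {"FR": 71_000_000, "XX": 18_000_000}


def primary_coverage(entries, populations):
    """Return {territory: largest listed speaker count / population}."""
    best = {}
    for territory, _language, speakers in entries:
        if speakers > best.get(territory, 0):
            best[territory] = speakers
    return {t: best.get(t, 0) / pop for t, pop in populations.items()}


def poorly_covered(entries, populations, threshold=0.5):
    """Return the sorted territories whose primary coverage is below threshold."""
    coverage = primary_coverage(entries, populations)
    return sorted(t for t, share in coverage.items() if share < threshold)


if __name__ == "__main__":
    for territory, share in primary_coverage(ENTRIES, POPULATIONS).items():
        print(f"{territory}: {share:.1%} covered by the largest listed language")
    print("suspicious:", poorly_covered(ENTRIES, POPULATIONS))
```

Only the largest listed language is compared with the population, because summing all listed languages would double-count people who speak one of them as a secondary language.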

And anyway, we still lack resources for important historic languages in the

> -----Original Message-----
> From: [] On behalf
> of Don Osborn
> Sent: Thursday, May 31, 2007 23:02
> To:; 'Daniel Yacob';;
> Subject: RE: [OT]non-terrestrial writing systems
> One could start right in New Mexico, after all:
> . Just
> don't use those characters willy-nilly - wrong combination might get us all
> into big trouble.
