Re: Unicode top 100 (was RE: Unicode 3.0 press statements)

From: Andrew Cunningham (andjc@ozemail.com.au)
Date: Fri Jan 21 2000 - 18:57:08 EST


Hi Everyone,

> I know, I know: the "top 100" languages list is utter nonsense and surely
> does not fit the public relations needs of The Unicode Consortium.
>
> However, as some people took the time to send me corrections and advice, I
> tried to integrate them in the list, just for our amusement.
>
> * John Cowan > "Azerbaijan has switched to Latin."
> [I moved it]
>
> * Joerg Knappen > Sunda uses Latin; Oromo uses Ethiopic.
> [I moved them]

I was under the impression that Oromo was now being written with
Latin rather than Ethiopic ...

Could anyone confirm or deny this? The only samples of Oromo I've seen on the
web have used Latin, while the same site uses Ethiopic for Amharic but not
for Oromo.
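
For anyone who wants to check a sample mechanically rather than by eye, here is
a minimal Python sketch. It assumes the sample is already decoded Unicode text,
and it guesses the script from the first word of each letter's Unicode
character name (e.g. "LATIN", "ETHIOPIC"). That is a rough heuristic, not the
official Unicode Script property, and the function names are made up for
illustration.

```python
import unicodedata
from collections import Counter


def script_counts(text):
    """Count letters by the first word of their Unicode character name.

    Heuristic only: "LATIN SMALL LETTER A" -> "LATIN",
    "ETHIOPIC SYLLABLE HA" -> "ETHIOPIC". Non-letters are ignored.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")  # "" if the character is unnamed
            if name:
                counts[name.split()[0]] += 1
    return counts


def dominant_script(text):
    """Return the most frequent script label in text, or None if no letters."""
    counts = script_counts(text)
    return counts.most_common(1)[0][0] if counts else None


if __name__ == "__main__":
    print(dominant_script("Afaan Oromoo"))  # language name, Latin spelling
    print(dominant_script("አማርኛ"))          # "Amharic" written in Ethiopic
```

This only says which script a given sample uses; it cannot settle which
orthography is official or most common for Oromo.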

>
> * Roozbeh Pournader > "Sindhi is written in Arabic script."
> [I moved it]
>
> * Thomas Chan > "... Other than Mandarin Chinese and Yue Chinese, the
> other "Chinese" ones don't really have developed writing traditions, so the
> question is sort of academic..."
> [See next]
>
> * John Cowan and I > similar concern for Italian dialects.
> [I collapsed most "dialects" under the entry of the "national language"
> spoken in the area, assuming that speakers of these languages would use the
> "national language" in writing (especially on computers)]
>
> * Kent Karsson > "... That does not even cover all of the official
> languages of the EU! So that "top 100" statement would be highly
> UNimpressive..."
> [EU languages are not more important than others; moreover many other
> languages are missing. Some of these languages (e.g. Hebrew) are relevant
> for Unicode because they use a special script, or are tricky, or are "often
> used" on computers, so I sort of added them without estimates]
>
> * Janko Stamenovic > split Serbo-Croatian into Serbian (rough estimate:
> 8 to 10 million) and Croatian.
> [The divorce is done: 10 million to Serbian and the rest to Croatian]
>
> Here are the revised statement and the new list (ordered by writing systems;
> the numbers show an estimate of the people speaking each language, in
> millions).
>
>
> "Unicode supports the top 100 languages. Unicode also supports all the
> official languages used in the EU and many other languages, some of which
> require unique writing systems."
>
>
> *** Latinate alphabet
> 332 SPANISH
> 322 ENGLISH
> 170 PORTUGUESE
> 98 GERMAN
> 76 JAVANESE
> 72 FRENCH
> 68 VIETNAMESE
> 59 TURKISH
> 46 ITALIAN
> 44 POLISH
> 31 AZERBAIJANI
> 27 SUNDA
> 26 ROMANIAN
> 24 HAUSA
> 20 DUTCH
> 20 YORUBA
> 18 MALAY (also written in Arabic)
> 17 INDONESIAN
> 17 IGBO
> 17 TAGALOG
> 15 HUNGARIAN
> 12 CZECH
> 11 CROATIAN
> 9 MALAGASY
> 9 RWANDA
> 9 SOMALI
> 9 ZULU
> 9 SWEDISH
> 8 NIGERIAN FULFULDE
> 7 HAITIAN CREOLE FRENCH
> (all other official languages in the EU)
>
> *** Greek alphabet
> 12 GREEK
>
> *** Cyrillic alphabet
> 170 RUSSIAN
> 41 UKRAINIAN
> 18 NORTHERN UZBEK
> 10 BELARUSAN
> 10 SERBIAN (also written in Latinate)
> 9 BULGARIAN
> 8 TATAR
> 8 KAZAKH
> 7 UYGHUR
>
> *** Armenian alphabet
> (ARMENIAN)
>
> *** Hebrew alphabet
> (HEBREW)
> (YIDDISH)
>
> *** Arabic alphabet
> 175 ARABIC (all dialects)
> 58 URDU
> 31 FARSI
> 30 WESTERN PANJABI
> 20 SINDHI
> 18 PASHTO
>
> *** Thaana alphabet
> (MALDIVIAN)
>
> *** Devanagari alphabet
> 182 HINDI
> 65 MARATHI
> 16 NEPALI
>
> *** Bengali alphabet
> 189 BENGALI
> 14 ASSAMESE
>
> *** Gujarati alphabet
> 44 GUJARATI
>
> *** Gurmukhi alphabet
> 26 EASTERN PANJABI
>
> *** Oriya alphabet
> 31 ORIYA
>
> *** Tamil alphabet
> 63 TAMIL
>
> *** Telugu alphabet
> 66 TELUGU
>
> *** Kannada alphabet
> 34 KANNADA
>
> *** Malayalam alphabet
> 34 MALAYALAM
>
> *** Sinhala alphabet
> 13 SINHALA
>
> *** Thai alphabet
> 35 THAI
>
> *** Lao alphabet
> (LAO)
>
> *** Myanmar alphabet
> 22 BURMESE
>
> *** Georgian alphabet
> (GEORGIAN)
>
> *** Hangul script
> 75 KOREAN (also uses CJK ideographs, a.k.a. hanja)
>
> *** Ethiopic script
> 17 AMHARIC
> 9 OROMO
>
> *** Cherokee script
> (CHEROKEE)
>
> *** Canadian syllabic script
> (INUIT)
>
> *** Khmer alphabet
> 7 KHMER
>
> *** Mongolian alphabet
> (MONGOLIAN)
>
> *** Braille patterns
> (many languages worldwide)
>
> *** Kana script
> 125 JAPANESE (also uses CJK ideographs, a.k.a. kanji)
>
> *** CJK ideographs (a.k.a. hanzi, kanji, hanja)
> 885 MANDARIN CHINESE
> 66 YUE CHINESE
> 282 (other Chinese dialects)
>
> *** Yi script
> (YI)
>
> *** Unknown (unwritten?)
> 25 BHOJPURI
> 24 MAITHILI
> 21 AWADHI
> 15 SARAIKI
> 15 CEBUANO
> 14 CHITTAGONIAN
> 14 MADURA
> 13 HARYANVI
> 12 MARWARI
> 12 MAGAHI
> 11 CHHATTISGARHI
> 10 DECCAN
> 8 ILOCANO
> 7 SHONA
> 7 KURMANJI
> 7 HILIGAYNON
> 7 AKAN
>
> THE END
>
> Ciao.
> Marco
>
>
>


