RE: Unicode 3.0 press statements

From: Marco.Cimarosti@icl.com
Date: Thu Jan 20 2000 - 12:05:49 EST


> a) catch people's attention

"Unicode can be used to write the top 100 languages in the world, as defined
by 'The Ethnologue' (http://www.sil.org/ethnologue/top100.html)!"

> b) be true!

It has already been pointed out that many linguists would consider such a
statement nonsense. Most of us probably speak two or more of these languages,
so how many times have we been counted!?

Moreover, I would be more confident about the truth of this if I just knew
what scripts are used by the 22 languages in my "Unknown" category. If
someone could help categorize them, I could at least do the sums, to
discover who wins between Latin and CJK.

Ciao.
        _Marco

*** Unknown (by Marco Cimarosti, that is :-)
 27000000 SUNDA
 25000000 BHOJPURI
 24260000 MAITHILI
 20540000 AWADHI
 19720000 SINDHI
 17000000 IGBO
 15015000 SARAIKI
 15000000 CEBUANO
 14000000 CHITTAGONIAN
 13694000 MADURA
 13000000 HARYANVI
 12104000 MARWARI
 12000000 MAGAHI
 10985000 CHHATTISGARHI
 10709800 DECCAN
  8920000 WEST-CENTRAL OROMO
  8000000 ILOCANO
  7611000 NIGERIAN FULFULDE
  7000000 SHONA
  7000000 KURMANJI
  7000000 HILIGAYNON
  7000000 AKAN

*** Latin script
332000000 SPANISH
322000000 ENGLISH
170000000 PORTUGUESE
 98000000 GERMAN
 75500800 JAVANESE
 72000000 FRENCH
 67662000 VIETNAMESE
 59000000 TURKISH
 44000000 POLISH
 37000000 ITALIAN
 26000000 ROMANIAN
 20000000 DUTCH
 17050000 INDONESIAN
 17000000 TAGALOG
 14500000 HUNGARIAN
 12000000 CZECH
  9398700 MALAGASY
  9306800 RWANDA
  9142000 ZULU
  9000000 SWEDISH
  8974000 LOMBARD
  7372000 HAITIAN CREOLE FRENCH
  7047400 NAPOLETANO-CALABRESE

*** Greek script
 12000000 GREEK

*** Cyrillic script
170000000 RUSSIAN
 41000000 UKRAINIAN
 24364000 SOUTH AZERBAIJANI
 18466000 NORTHERN UZBEK
 10200000 BELARUSAN
  9000000 BULGARIAN
  8000000 TATAR
  8000000 KAZAKH
  7595512 UYGHUR
  7059000 NORTH AZERBAIJANI

*** Arabic script
 58000000 URDU
 42500000 EGYPTIAN ARABIC
 24280000 WESTERN FARSI
 22400000 ALGERIAN ARABIC
 19542000 MOROCCAN ARABIC
 18900000 SAIDI ARABIC
 16000000 SUDANESE ARABIC
 15000000 NORTH LEVANTINE ARABIC
 13900000 MESOPOTAMIAN ARABIC
  9800000 NAJDI ARABIC
  9685000 NORTHERN PASHTO
  9308000 TUNISIAN ARABIC
  8206000 SOUTHERN PASHTO
  7600000 SANAANI ARABIC
  7000000 EASTERN FARSI

*** Devanagari script
182000000 HINDI
 64783000 MARATHI
 16056000 NEPALI

*** Bengali script
189000000 BENGALI
 14634000 ASSAMESE

*** Gurmukhi script
 30000000 WESTERN PANJABI
 26013000 EASTERN PANJABI

*** Gujarati script
 44000000 GUJARATI

*** Oriya script
 31000000 ORIYA

*** Tamil script
 63075000 TAMIL

*** Telugu script
 66350000 TELUGU

*** Kannada script
 33663000 KANNADA

*** Malayalam script
 34022000 MALAYALAM

*** Sinhala script
 13220000 SINHALA

*** Thai script
 20047000 THAI
 15000000 NORTHEASTERN THAI

*** Myanmar script
 22000000 BURMESE

*** Ethiopic script
 17413000 AMHARIC

*** Khmer script
  7039200 CENTRAL KHMER

*** CJK script (Chinese/Japanese/Korean script; includes Japanese and Korean
phonetic scripts)
885000000 MANDARIN CHINESE
125000000 JAPANESE
 77175000 WU CHINESE
 75000000 KOREAN
 66000000 YUE CHINESE
 49000000 MIN NAN CHINESE
 45000000 JINYU CHINESE
 36015000 XIANG CHINESE
 34000000 HAKKA CHINESE
 20580000 GAN CHINESE
 10537000 MIN BEI CHINESE
 10000000 NORTHERN ZHUANG

*** Languages using more than one script
 24200000 HAUSA (Latin and Arabic)
 21000000 SERBO-CROATIAN (Latin and Cyrillic)
 20000000 YORUBA (Latin and Arabic)
 17600000 MALAY (Latin and Arabic)
  9472000 SOMALI (Latin and Arabic)

*** ... Etcetera...
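
For anyone who wants to do the sums mechanically, here is a minimal Python
sketch. It assumes the list stays in the plain-text layout used above (a
"*** name" header line followed by "count NAME" lines). The function name
sum_by_category and the SAMPLE variable are made up for illustration, and the
CJK header is shortened to a single line. Only the Latin and CJK figures from
this message are included.

```python
import re
from collections import defaultdict

# A category header looks like "*** Latin script".
HEADER = re.compile(r"^\*\*\*\s+(.+?)\s*$")
# An entry looks like " 27000000 SUNDA": a speaker count, then a name.
ENTRY = re.compile(r"^\s*(\d+)\s+(\S.*?)\s*$")


def sum_by_category(text):
    """Return {category name: total speakers} for a list in the layout above."""
    totals = defaultdict(int)
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        entry = ENTRY.match(line)
        # Lines that are neither a header nor an entry are ignored.
        if entry and current is not None:
            totals[current] += int(entry.group(1))
    return dict(totals)


# Figures copied from the Latin and CJK lists in this message.
SAMPLE = """\
*** Latin script
332000000 SPANISH
322000000 ENGLISH
170000000 PORTUGUESE
 98000000 GERMAN
 75500800 JAVANESE
 72000000 FRENCH
 67662000 VIETNAMESE
 59000000 TURKISH
 44000000 POLISH
 37000000 ITALIAN
 26000000 ROMANIAN
 20000000 DUTCH
 17050000 INDONESIAN
 17000000 TAGALOG
 14500000 HUNGARIAN
 12000000 CZECH
  9398700 MALAGASY
  9306800 RWANDA
  9142000 ZULU
  9000000 SWEDISH
  8974000 LOMBARD
  7372000 HAITIAN CREOLE FRENCH
  7047400 NAPOLETANO-CALABRESE

*** CJK script
885000000 MANDARIN CHINESE
125000000 JAPANESE
 77175000 WU CHINESE
 75000000 KOREAN
 66000000 YUE CHINESE
 49000000 MIN NAN CHINESE
 45000000 JINYU CHINESE
 36015000 XIANG CHINESE
 34000000 HAKKA CHINESE
 20580000 GAN CHINESE
 10537000 MIN BEI CHINESE
 10000000 NORTHERN ZHUANG
"""

totals = sum_by_category(SAMPLE)
# Print the categories from largest to smallest total.
for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{total:>13,} {name}")
```

These totals leave out the "Unknown" category and the languages using more
than one script, so any Latin versus CJK comparison made from them is
provisional at best.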


