SIL (was Re: Reopening RFC 1766 - Language Tags)

From: Edward Cherlin et al.
Date: Thu Nov 16 1995 - 15:31:43 EST

Concerning the Summer Institute of Linguistics (affiliated with Wycliffe
Bible Translators) and its Ethnologue database of world languages,
J"org Knappen <KNAPPEN@VKPMZD.kph.Uni-Mainz.DE> wrote:

>I don't want to enter a debate about SIL's attitudes,
>but I want to remark, that their database is flawed from the beginning.

Its greatest virtue is that it is readily accessible, even if not
technically perfect. I have the printed version, and there is a searchable
Web site. More than 6,000 languages are listed and cross-referenced.

>Defining `language' as opposed to `dialect' is a very hairy, politically
>charged affair anyhow, therefore one should first state criteria of
>classification and afterwards try to follow those criteria as best as

Actually, we need the field data before we can attempt a definition. The
debate over definitions will not end in our lifetimes.

>However, SIL's definition of `language' is blurred, and they tend
>to go too deep into dialects, but not cutting at the same level for all
>dialects. It is just a thrown-together pile of information of different
>quality, not the product of real research.

Ethnologue does not pretend to be original research. It states that it is
compiled from the best available data. If you know of better data, tell
them where to find it, or send it to them.

Ethnologue states, "Variants of the languages that are not distinct enough
to need separate literature are treated as dialects, and are listed under
the language and not as separate entries, unless attitudes or other social
factors are strong enough that they need to be treated as separate
sociolinguistic entities. For many entries, however, we lack information on
intelligibility, and so have followed our best sources as to what they
consider to be a language or a dialect." (11th edition, 1988, p. vii)

The information is certainly of quite variable quality, since the source
data are quite variable in quality. There are flaws, such as a dearth of
script information, which are relevant to our concerns. However, SIL
clearly states that corrections and further information are welcomed. If
the database is flawed, it would be better to fix it than complain about
it. SIL tries to address the dialect issue by noting degrees of mutual
comprehension. I personally prefer more rather than fewer listings,
especially where dialects have different names or are spoken in different

Anyway, here is a sample, chosen somewhat at random from the 11th edition
(1988). Under Sudan:

"Arabic, Modern Standard...Not intelligible with Sudanese Arabic or
Sudanese Creole Arabic...Official language..."

"Arabic, Sudanese (Khartoum Arabic)...Not intelligible with Modern Standard
Arabic ('school Arabic') or Sudanese Creole Arabic. Western Sudan
Colloquial Arabic, Juba Arabic, and Khartoum Arabic have little
compatibility (Alan S. Kaye 1988). Lingua franca..."

"Kaye, Alan S. Review of Bjorn H. Jernudd and Muhammad D. Ibrahim, 'Aspects
of Arabic sociolinguistics,' International Journal of the Sociology of
Language 1986. Language 64:1.210 (1988)"

On the face of it, that is a good, scholarly treatment of dialects. If
someone would care to cite specific flaws, we might be able to discuss
them.
One more sample, under UK:


In my experience, mutual intelligibility of these dialects is good but not
guaranteed. I've heard Britons who seemed to be speaking English of which I
understood nothing. Again, this seems to be sound, cutting at the right
level.
>For a comprehensive list of languages, one should ask university
>linguists (there exist some pretty complete compilations, some continents
>and language families are better covered than others).

Pointers, please, J"org. We can't discuss the merits of work we can't find.
SIL uses all published linguistic data, according to the book, and anything
that scholars care to E-mail or put on the Web.

>--J"org Knappen.

