Re: Making orthographies computer-ready

Date: Tue Jul 30 2002 - 09:39:35 EDT

On 07/30/2002 12:21:05 AM "Doug Ewell" wrote:

>> OK, now you've hit a hot button: The industry needs to wake up to the
>> fact that the requirement that a language have an ISO-639 2-letter
>> code before a locale can be created is a dead end. There just aren't
>> enough 2-letter codes to go around, and ISO 639-1 has restrictive
>> requirements for doling out 2-letter codes -- it wasn't created for
>> the benefit of locale implementers, but for the benefit of
>> terminologists.
>And bibliographers.

No! The ISO 639-1 standard was developed by terminologists. ISO 639-2
was due primarily to bibliographers (though the terminologists had a finger
in the pie).

>In any case, the real problem is not the ISO 639
>"50 documents" restriction.

ISO 639-1 is even more restrictive than just the 50 docs requirement.

> No language coding system -- ISO 639 or
>otherwise -- is sufficient to describe locales, because locales consist
>of more than just languages.

Of course, that's another side of the locales problem (and we've got a
separate list going to discuss such issues).

> Lumping together all English speakers in
>the world, for example, would be just silly. The standard solution is
>to append a country code, as though locale were simply a matter of
>language+country, which is almost as silly -- it assumes all English
>speakers in the U.S. can use the same settings, but German speakers in
>Switzerland and Liechtenstein require different settings.


>> Luiseño and Tongva simply are not candidates.
>Luiseño does have a 3-letter code (lui),

But not a 2-letter code, and isn't likely to get a 2-letter code.

>while Tongva has neither an ISO
>639 code nor an Ethnologue code (the on-line Ethnologue has no listing
>for either Tongva or Gabriel[ie][nñ]o).

The lack of an Ethnologue code is probably due to the language having become
extinct before the development of Ethnologue began (circa 1950). There are
some languages listed in the Ethnologue that were extinct prior to that
date, but the inclusion of most is happenstance -- someone made a specific
request to the Editor, and the Editor complied. But there has never been
any effort to create a comprehensive listing of extinct languages. SIL is
now participating in the EMELD project, and we are coordinating language
cataloguing efforts with others. In particular, SIL and The Linguist List
staff have agreed on a division of labour: SIL will catalogue languages
that were living at or after 1950, while The Linguist List will catalogue
languages that became extinct prior to 1950.

>The requirement that it doesn't meet is that it already has a 3-letter
>code (haw).

I can see why you might say that, but the discussions didn't go quite like
that. Enough said.

>> Instead of asking for a 2-letter code, the engineers should have been
>> looking at what it would take to make the software support a 3-letter
>> code (which already exists in ISO 639-2).
>Again, RFC 3066 spells out very clearly how to do this.

Assuming the protocol one is dealing with relies on an RFC for language
codes. Note, though, that some protocols (e.g. ISO/IEC TR 14652) reference ISO
639(-1) directly.
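
For software that does follow RFC 3066, the fallback is simple enough to
sketch in Python: use the ISO 639-1 2-letter code when a language has one,
otherwise the ISO 639-2 3-letter code, optionally followed by an ISO 3166
country code. The table ISO_639 and the function language_tag are
hypothetical names made up for illustration; a real implementation would
load the full code lists rather than three hand-picked entries.

```python
# Hypothetical lookup table for illustration only; a real implementation
# would load the complete ISO 639-1 and ISO 639-2 code lists.
ISO_639 = {
    # language name: (ISO 639-1 2-letter code or None, ISO 639-2 3-letter code)
    "English": ("en", "eng"),
    "Hawaiian": (None, "haw"),
    "Luiseño": (None, "lui"),
}


def language_tag(language, country=None):
    """Build an RFC 3066-style tag for a language in the table above.

    The 2-letter code is preferred when one exists; otherwise the
    3-letter code is used as the primary subtag.
    """
    alpha2, alpha3 = ISO_639[language]
    primary = alpha2 or alpha3
    if country is None:
        return primary
    # Country subtags are conventionally written in upper case.
    return f"{primary}-{country.upper()}"


if __name__ == "__main__":
    print(language_tag("English", "us"))
    print(language_tag("Hawaiian"))
    print(language_tag("Luiseño", "US"))
```

Nothing in the tag syntax itself requires a 2-letter code; the limitation
sits in software that hard-codes a 2-letter primary subtag.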

>People who do this are probably also the
>ones, when a telephone area code split or overlay occurs, who regard the
>new area code as less "prestigious" or "legit" than the old one

I was recently amazed by my wife: we got a cell phone, and when given the
choice of area code, she indicated a strong preference for the old one (now
Dallas interior -- the surrounding ring has the new one). And the area
codes changed five years ago or more.

- Peter

Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
