/|/|ike Ayers <Mike_Ayers@bmc.com> wrote:
> BTW, I've gotten confused during this thread over the naming of
> country codes, etc. There are ISO specs, RFCs, POSIX specs (and
> more?)... Is this information conveniently summarized anywhere so
> that I may enlighten myself?
Here's a convenient, if perhaps oversimplified, summary.
The standard for two-letter language codes is ISO 639-1. ISO 639-2
specifies three-letter language codes; it actually has two variants,
bibliographic (639-2/B) and terminologic (639-2/T), which differ for
only a small number of languages.
The standard for two-letter country codes is ISO 3166-1, which also
specifies collections of three-letter and numeric country codes. ISO
3166-2 specifies political subdivisions within a country.
RFC 1766 describes a way to use ISO 639-1 and 3166-1 to create language
tags for use on the Internet (e.g. in mail messages). A lowercase 639-1
language code can be followed by a hyphen and an uppercase 3166-1 country
code to represent the concept of "language X as spoken in country Y."
For example, "fr-CA" means French as spoken in Canada.
Unicode Technical Report #7, "Plane 14 Characters for Language Tags,"
recommends a slight adaptation of the RFC 1766 approach (both codes are
RFC 1766 is currently being revised to allow three-letter (639-2), as
well as two-letter (639-1), language codes. This will permit the use
of language tags for hundreds of less-common languages that have no two-
letter code. The revision will also provide ways to use 3166-2 country-
subdivision codes and (draft) ISO 15924 script codes in language tags.
Naturally, the revised version will not be called RFC 1766, but will be
assigned a new number. I don't know if UTR #7 will be updated to refer
to the new RFC when it is published (I think it should be).
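To make the basic RFC 1766 form described above concrete, here is a
minimal Python sketch. The function names are my own, purely for
illustration, and the code handles only a two-letter language code with
an optional two-letter country code. It does not attempt the three-letter,
subdivision, or script extensions planned for the revision.

```python
import re

# Hypothetical helpers for illustration only. The pattern accepts just
# "ll" or "ll-CC" (two-letter language, optional two-letter country).
_SIMPLE_TAG = r"([A-Za-z]{2})(?:-([A-Za-z]{2}))?"


def make_language_tag(language, country=None):
    """Join a 639-1 language code and an optional 3166-1 country code."""
    tag = language.lower()          # language code is written lowercase
    if country:
        tag += "-" + country.upper()  # country code is written uppercase
    return tag


def split_language_tag(tag):
    """Split a simple tag into (language, country); country may be None."""
    match = re.fullmatch(_SIMPLE_TAG, tag)
    if match is None:
        raise ValueError("not a simple language[-COUNTRY] tag: %r" % tag)
    language, country = match.groups()
    # Normalize case so that "en-us" and "EN-US" give the same result.
    return language.lower(), (country.upper() if country else None)
```

Normalizing case on the way in and out means callers can compare the
results directly, whatever case the original tag was written in.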
POSIX locale names are also formed from 639-1 language codes and 3166-1
country codes. Unlike in RFC 1766, the elements are separated by an
underscore rather than a hyphen. POSIX uses this language/country code
to represent not only the language and local dialect, but all the
attributes of a locale setting, such as decimal separator, thousands
separator, currency symbol, default date format, etc. It is widely
regarded as inadequate for covering even a reasonable subset of locale
There are other standards for language and country codes, but for our
purposes these are by far the most common.
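As a rough illustration of the hyphen-versus-underscore difference
between RFC 1766 tags and POSIX locale names mentioned above, here is a
small Python sketch. The function names are hypothetical. The handling
of a trailing ".codeset" or "@modifier" suffix is my assumption about
how locale names commonly appear on real systems and is not something
covered in the summary above.

```python
def tag_to_posix(tag):
    """Turn a simple tag such as "fr-CA" into a name such as "fr_CA"."""
    language, _, country = tag.partition("-")
    name = language.lower()
    if country:
        name += "_" + country.upper()   # POSIX separates with an underscore
    return name


def posix_to_tag(locale_name):
    """Turn a name such as "fr_CA" into a tag such as "fr-CA"."""
    # Assumption: drop any ".codeset" or "@modifier" suffix first.
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    language, _, country = base.partition("_")
    tag = language.lower()
    if country:
        tag += "-" + country.upper()    # RFC 1766 separates with a hyphen
    return tag
```

Note that this maps only the name. It says nothing about the other
locale attributes (decimal separator, currency symbol, and so on) that a
POSIX locale name is expected to imply.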
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:13 EDT