Re: lowercased Unicode language tags ? (was:ISO 15924)

From: John Cowan
Date: Mon May 03 2004 - 06:52:54 CDT

Antoine Leca scripsit:

> > Catalan is not Spanish, and has its own code.
> Sorry to contradict you slightly, John. Please note that this issue is
> sensitive for some Catalans here in Spain, so I mention it for the sake of
> everybody here knowing it.

I'm not sure where the contradiction comes in. I am using "Spanish" as
a noun, meaning the Spanish language (i.e. "espan~ol", "castellano"),
and saying that Catalan and Spanish are distinct languages. This seems
to be what you are saying also.

> Particularly when I read
> Tags constructed wholly from the codes that are assigned
> interpretations by this chapter do not need to be registered with
> IANA before use.
> inside clause 2, which otherwise says that the 2nd subtag when 2 letter
> designates a country, and also says that 3rd and next subtags do not have
> semantical restrictions.

All tags need to be registered in the RFC 3066 regime, except those of
the following forms: xx, xxx, xx-yy, xxx-yy, where xx is an ISO 639-1
code, xxx is an ISO 639-2 code (for a language that does not have an
ISO 639-1 code), and yy is an ISO 3166-1 code. When a code from any
other source is used, including ISO 3166-2, registration is required.
Thus en-us can be used even though it is not registered, but en-us-ma
(Massachusetts English) would require registration.
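
As a rough illustration of the rule above, here is a minimal Python sketch.
The function name needs_registration and the three code sets are hypothetical
and hold only a handful of sample codes; a real check would load the full
ISO 639-1, ISO 639-2 and ISO 3166-1 lists. It covers only the four exempt
forms described here and ignores anything else RFC 3066 may allow.

```python
# Tiny illustrative samples only (assumption): the real code lists are far longer.
ISO_639_1 = {"ca", "en", "es", "it"}
ISO_639_2_ONLY = {"ast", "haw"}  # languages that have no ISO 639-1 code
ISO_3166_1 = {"es", "fr", "it", "us"}


def needs_registration(tag: str) -> bool:
    """Return True unless the tag has the form xx, xxx, xx-yy or xxx-yy."""
    # Tags are compared case-insensitively, so normalise to lowercase first.
    subtags = tag.lower().split("-")

    # Three or more subtags (e.g. en-us-ma) always fall outside the exempt forms.
    if len(subtags) > 2:
        return True

    # The first subtag must be a known ISO 639-1 or ISO 639-2 language code.
    primary = subtags[0]
    if primary not in ISO_639_1 and primary not in ISO_639_2_ONLY:
        return True

    # A second subtag, if present, must be an ISO 3166-1 country code.
    if len(subtags) == 2 and subtags[1] not in ISO_3166_1:
        return True

    return False


if __name__ == "__main__":
    for example in ("ca", "en-us", "ca-it", "en-us-ma"):
        print(example, needs_registration(example))
```

With these sample sets, en-us and ca-it pass without registration, while
en-us-ma is flagged because it has a third subtag.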

Catalan is a language that doesn't segment nicely on national boundaries,
so plain "ca" is the right tag for it in general, although "ca-it" might
be useful for the Algherese dialect.

If you have ever wondered if you are in hell,         John Cowan
it has been said, then you are on a well-traveled
road of spiritual inquiry.  If you are absolutely
sure you are in hell, however, then you must be
on the Cross Bronx Expressway.          --Alan Feuer, NYTimes, 2002-09-20
