Re: lowercased Unicode language tags ? (was:ISO 15924)

From: John Cowan (cowan@ccil.org)
Date: Sun May 02 2004 - 23:58:05 CDT


Doug Ewell scripsit:

> Neither ISO 3166-3 nor (perhaps more annoyingly) ISO 3166-2 codes are
> allowed in RFC 3066 language tags. So at least in that context, there
> is no possibility of confusing them with ISO 15924 script codes.

Actually, anything can be used in RFC 3066 if it's registered. We
already have sgn-us-ma for Martha's Vineyard (Massachusetts) Sign
Language, for instance: us-ma is an ISO 3166-2 code.

> And again, RFC 3066 language tags don't allow for the use of these ISO
> 3166-2 region codes. I'm not quite sure why this is; I think it might
> be useful on occasion to be able to encode:
>
> es-US-CA
> es-US-FL
> es-US-NY
>
> to identify the Mexican-, Cuban-, and Puerto Rican-influenced dialects of
> Spanish spoken in California, Florida, and New York respectively.

This can be done if you register them explicitly.
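
To make the registration point concrete, here is a rough sketch in Python
of how one might tell whether a tag is covered by the RFC 3066 rules alone
or would need explicit registration. The function name needs_registration
is made up for illustration, and the sketch assumes that only the subtag
shapes matter: it does not look codes up in ISO 639 or ISO 3166, and it
does not consult the IANA registry.

```python
import re

# RFC 3066 syntax: a primary subtag of 1-8 letters, followed by any number
# of subtags of 1-8 letters or digits, separated by hyphens.
_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def needs_registration(tag):
    """Return True if the tag goes beyond what RFC 3066 interprets by itself.

    Hypothetical helper: it checks subtag shapes only and never verifies
    that a code is actually assigned in ISO 639 or ISO 3166.
    """
    if not _TAG.fullmatch(tag):
        raise ValueError(f"not a well-formed RFC 3066 tag: {tag!r}")

    subtags = tag.lower().split("-")
    primary = subtags[0]

    if primary == "x":
        return False  # private use, never registered
    if primary == "i":
        return True   # the i- prefix is reserved for IANA registrations
    if len(primary) not in (2, 3):
        # Other primary subtags are not assignable under RFC 3066.
        raise ValueError(f"unsupported primary subtag: {primary!r}")

    if len(subtags) == 1:
        return False  # bare ISO 639 language code
    if len(subtags) == 2 and len(subtags[1]) == 2 and subtags[1].isalpha():
        return False  # language plus 2-letter ISO 3166 country code

    # Anything longer, such as a third subtag borrowed from ISO 3166-2,
    # has no predefined meaning and so has to be registered.
    return True


for example in ("es-US", "es-US-CA", "sgn-us-ma"):
    print(example, needs_registration(example))
```

Under these assumptions es-US passes without registration, while es-US-CA
and sgn-us-ma both come out as needing it, which matches the cases above.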

> The successor to RFC 3066 is already on its way. It will allow ISO
> 3166-1 country subtags and ISO 15924 script subtags to coexist, and be
> used in a generative way instead of by registering each combination
> (still no ISO 3166-2, though).

*If* it survives the IETF process, which is by no means certain.
Harald isn't behind it, for one thing.

-- 
John Cowan  jcowan@reutershealth.com  www.reutershealth.com  www.ccil.org/~cowan
Heckler: "Go on, Al, tell 'em all you know.  It won't take long."
Al Smith: "I'll tell 'em all we *both* know.  It won't take any longer."

