Re: lowercased Unicode language tags ? (was:ISO 15924)

From: Doug Ewell
Date: Mon May 03 2004 - 10:37:52 CDT

Christopher Vance <vance at aurema dot com> wrote:

> On Mon, May 03, 2004 at 10:17:02AM +0200, Philippe Verdy wrote:
>> Oops, I searched an example, and forgot to change the leading code.
>> This should have been read as: "ca-Latn-ESCI" or "ca-ESCI".
>> It's not important here, it was only an arbitrary example to show
>> that the syntax in RFC 3066 may become ambiguous to parse.
>> Someone says that this should be "ca-Latn-ES-CI" or "ca-ES-CI"
>> without this problem. But aren't there (sub-country) region codes with
>> 4 letters?
> Even if there were (my draft is at work and I can't check right now),
> there's no problem. Script names are mixed case, and ISO 3166-2
> subcountry codes are all upper.

(1) Philippe's point was, what if everything is lowercased, as
recommended for Plane 14 language tags?

(2) No, there are no ISO 3166-2 codes with 4 characters.

(3) Even if there were, they wouldn't cause ambiguities in parsing RFC
3066 tags. See my other post.
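
For readers without the other post to hand, here is a rough sketch of how a parser might label hyphen-separated subtags by length and position alone, so that case carries no information. The rule used (a 4-letter second subtag is a script, the first 2-letter subtag after the language is a region) is an assumption borrowed from the draft successor to RFC 3066, not something RFC 3066 itself defines, and the function name classify_subtags is made up for illustration.

```python
def classify_subtags(tag):
    """Label each subtag of a language tag, ignoring case entirely.

    Assumed rule (from the draft successor to RFC 3066, not RFC 3066 itself):
      - the first subtag is the language
      - a 4-letter subtag in second position is a script
      - the first 2-letter subtag after the language is a region
      - anything else is left unlabelled ("other")
    """
    parts = tag.lower().split("-")
    labelled = [("language", parts[0])]
    seen_region = False
    for position, sub in enumerate(parts[1:], start=1):
        if position == 1 and len(sub) == 4 and sub.isalpha():
            kind = "script"
        elif not seen_region and len(sub) == 2 and sub.isalpha():
            kind = "region"
            seen_region = True
        else:
            kind = "other"
        labelled.append((kind, sub))
    return labelled


if __name__ == "__main__":
    for example in ("ca-Latn-ES-CI", "ca-latn-es-ci", "ca-ES-CI"):
        print(example, classify_subtags(example))
```

Under this assumed rule the mixed-case and all-lowercase forms of a tag get identical labels, because only the hyphens, the subtag lengths and their order are consulted. A run-together second subtag such as "esci" would be labelled as a script by the same rule, which is the kind of collision Philippe is asking about and one that the hyphenated form "es-ci" avoids.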

-Doug Ewell
 Fullerton, California
