RE: [OT] Reusing the same property (was: RE: PRI #202: Extensions to NameAliases.txt for Unicode 6.1.0)

From: Doug Ewell <doug_at_ewellic.org>
Date: Wed, 31 Aug 2011 12:09:53 -0700

Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

>> That would be extending the use of ISO 639 beyond identification of
>> languages.
>
> Nothing is extended; there already exist private-use codes in ISO 639
> (e.g. qaa-qtz).

Extending conceptually, not architecturally. Yes, the architecture of
639 allows for private-use code elements, but they aren't supposed to be
used for things that 639 itself wouldn't be used for.
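
For illustration, the private-use range mentioned above (qaa-qtz) can be
checked mechanically. This is only a minimal sketch; the function name
is hypothetical and is not part of ISO 639 or of any library.

```python
def is_iso639_private_use(code: str) -> bool:
    """Return True if `code` lies in the private-use range qaa..qtz.

    Hypothetical helper for illustration only. It checks the shape and
    range of the code, not whether anyone actually uses it.
    """
    code = code.lower()
    return (
        len(code) == 3
        and code.isascii()
        and code.isalpha()
        and "qaa" <= code <= "qtz"
    )


for candidate in ("qaa", "qtz", "qua", "www"):
    print(candidate, is_iso639_private_use(candidate))
```

Note that a code passing this check is still a language code element;
the check says nothing about what the code is being used to identify.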

> This already allows avoiding some collisions with
> other standards, so "www" could privately be substituted by "qaa".

Sure, if one has already made the mistake of overloading the code space,
as Very Well-Known did.

> And given that ISO 639 does not have any 1-letter code, and that ISO
> 639 is used (though only via an informative reference) in BCP 47, the
> 1-letter code could also take into account the existing 1-letter
> prefixes used in BCP 47 ("i" for legacy IANA codes, for example), so
> why not assign the "q" prefix to such private-use mechanisms, with
> fewer restrictions on the code format and length?

Now *that* would be an architectural change. Once you do that, you are
of course no longer following the standard, but creating a new
"superset" standard. This is kind of like the people who want to extend
UTF-16 with "super" or "hyper" surrogates to access billions of code
points: it's simply not UTF-16 any more.
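
To make the UTF-16 comparison concrete, here is a small sketch of the
standard surrogate-pair arithmetic. The constant and function names are
my own; only the ranges and the formula come from the UTF-16 definition.
The largest value a pair can encode is U+10FFFF, so anything that
reaches further needs a different encoding form.

```python
# Surrogate ranges as defined for UTF-16.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a high and a low surrogate into one code point."""
    if not HIGH_START <= high <= HIGH_END:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not LOW_START <= low <= LOW_END:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # 10 bits from each surrogate, offset past the BMP.
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# The highest possible pair yields the last code point, U+10FFFF.
print(hex(decode_surrogate_pair(HIGH_END, LOW_END)))
```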

BCP 47 cites ISO 639 as only an informative reference because its source
of language subtags is the IANA Language Subtag Registry. No new tags
starting with "i-" will be registered, and the syntax is not available
for private use.

> E.g. "q-www" accepted as a private alias (of "www" here, even if this
> mapping is not implied by the ISO 639 standard itself), and still
> parsable in BCP 47 where it would be followed by the same standard
> sublabels (BCP 47 would require a very minor update to parse such
> pairs of sublabels as a single language code).

Or you could actually follow BCP 47, and use "x-www" instead.
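
As a rough sketch of the difference, the following checks whether a
whole tag is a private-use tag in the BCP 47 sense: the singleton "x"
followed by one or more subtags of 1 to 8 alphanumeric characters. The
function name is hypothetical, and this covers only the whole-tag
private-use form, not the full BCP 47 grammar.

```python
import re

# "x" followed by one or more "-"-separated subtags of 1 to 8
# ASCII letters or digits; tags are matched case-insensitively.
_PRIVATE_USE_TAG = re.compile(
    r"x(?:-[a-z0-9]{1,8})+", re.IGNORECASE | re.ASCII
)


def is_private_use_tag(tag: str) -> bool:
    """Return True if `tag` is entirely a private-use tag such as "x-www"."""
    return _PRIVATE_USE_TAG.fullmatch(tag) is not None


for candidate in ("x-www", "q-www"):
    print(candidate, is_private_use_tag(candidate))
```

Under this check "x-www" is accepted and "q-www" is not, which is the
point: the "x" mechanism already exists, and "q" would need a new rule.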

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
Received on Wed Aug 31 2011 - 14:12:00 CDT
