RE: Language subtags (was: RE: [OT] Reusing the same property)

From: Doug Ewell <doug_at_ewellic.org>
Date: Thu, 01 Sep 2011 11:41:54 -0700

Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

> Well I know that this is now going out of topic, because someone else

me

> spoke about "www" (before that I had only spoken about the general
> need for any encoding open standard that wants to be universal, to
> assign a private-use area).
>
> To which there was a desire to have a larger space than just qa..qt
> union qaa..qtz, for easier (algorithmic) mapping of local-user codes
> (both in ISO 639 and BCP 47).

'qa' through 'qt' is not reserved in BCP 47, and as far as I know it is
not reserved in 639-1.

I've since discovered that ISO 639-6 has assigned more than a hundred
code elements in the range 'qaaa' through 'qtzz', and apparently no code
elements marked "private-use" or "user-defined," so it looks like the
current repertoire of 520 reserved code elements across all parts of ISO
639 will remain as is.

> I had also wanted to show that the "x-" prefix in BCP 47 makes the
> language tag not parsable like generic structured tags (that are also
> extensible to support locale tags, using extensions such as the one
> using the "u" subtag defined by Unicode, mostly for the CLDR, e.g. to
> encode collation options, or other locale conventions). Using the
> BCP47 "x-" prefix does not permit those extensions, because "x-" BCP47
> language tags have no structure.

We know that.
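
For anyone following along, here is a rough sketch of the difference,
in Python. The function name split_tag is my own invention, and this is
a deliberately simplified splitter, not a conforming BCP 47 parser: it
ignores script, region and variant validation and grandfathered tags.
It only shows that a tag starting with "x-" is opaque from the first
subtag onward, while a tag built on a private-use language subtag such
as 'qaa' can still carry a "u" extension.

```python
def split_tag(tag):
    """Split a language tag into (language, extensions, private_use).

    Simplified illustration only: script, region and variant subtags
    are skipped without validation, and grandfathered tags are ignored.
    """
    subtags = tag.lower().split("-")

    # A tag that starts with "x" is private use as a whole:
    # nothing after the "x" has any defined structure.
    if subtags[0] == "x":
        return None, {}, subtags[1:]

    language = subtags[0]
    extensions = {}      # singleton -> list of subtags that follow it
    private_use = []
    current = None       # singleton whose subtags are being collected

    for i, subtag in enumerate(subtags[1:], start=1):
        if subtag == "x":
            # Everything after "x" is private use; stop interpreting.
            private_use = subtags[i + 1:]
            break
        if len(subtag) == 1:
            current = subtag
            extensions[current] = []
        elif current is not None:
            extensions[current].append(subtag)
        # Otherwise: script/region/variant, skipped in this sketch.

    return language, extensions, private_use


print(split_tag("qaa-u-co-phonebk"))
# ('qaa', {'u': ['co', 'phonebk']}, [])
print(split_tag("x-mylang-u-co-phonebk"))
# (None, {}, ['mylang', 'u', 'co', 'phonebk'])
```

In the second call the "u" is just another private-use subtag, which is
the point being made about "x-" tags.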

> And that's why I spoke about two alternatives:
> - using another singleton letter "q" (yes in BCP 47 only), followed by
> one subtag, to create arbitrary local-use language tags, that would
> still remain parsable and would support the "u" extension mechanism
> - using ranges of codes starting with qa..qt of arbitrary longer lengths
> (not limited to 2-3 letters as now), which means a change both in ISO
> 639 (for code allocation) and BCP 47 (to restrict 5 to 8-letter codes
> that are NOT freely usable for local-use, but still open for
> registration, so that the IANA registry could accept a registration
> of 5-8 letter codes starting with qa..qt)

I don't see the advantage of either of these mechanisms, compared to
using the existing range 'qaa' through 'qtz'. Is there a need for more
than 520 private-use language identifiers?
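
For reference, the figure of 520 is just 20 second letters ('a' through
't') times 26 third letters. A quick sketch in Python, where the helper
name is_private_use_language is my own and not taken from any library:

```python
import string


def is_private_use_language(subtag):
    """Return True if subtag falls in the range 'qaa' through 'qtz'."""
    s = subtag.lower()
    return (
        len(s) == 3
        and s[0] == "q"
        and "a" <= s[1] <= "t"
        and s[2] in string.ascii_lowercase
    )


# Enumerate every three-letter code starting with 'q' and count the
# ones that land in the private-use range.
count = sum(
    is_private_use_language("q" + second + third)
    for second in string.ascii_lowercase
    for third in string.ascii_lowercase
)
print(count)  # 520
```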

> I hope this summary correctly represents what I wanted to show, because
> once again the intent has been misunderstood and some people on this
> list were assuming things that I did not intend to request.
>
> In fact I have not requested anything, just spoken about the existing
> possibilities, that would permit an application to use a comfortable
> space for its local uses that can easily remap some unusable codes to
> a PUA space where it can create aliases that would be recognized
> automatically by this local application as such (i.e. an alias of the
> standard language code), easing the interoperability of this
> application with the rest of the world, even if it needs to use
> local-use codes.

I don't think I said you were requesting anything. I clarified some
details about BCP 47 and ISO 639 code allocation, and addressed some
statements that others might misunderstand, such as that ISO 639 code
elements could be regarded as "prefixes" of others.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
Received on Thu Sep 01 2011 - 13:43:49 CDT
