RE: new version of BCP 47: language identifiers

From: Phillips, Addison (addison@amazon.com)
Date: Thu Jun 25 2009 - 11:43:17 CDT

In reply to: Mark Davis ⌛: "new version of BCP 47: language identifiers"

Note: the registry will be updated first. It usually takes the RFC Editor a while to publish the draft, whereas the registry conversion will probably happen sometime in the next couple of weeks.

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Mark Davis ⌛
Sent: Thursday, June 25, 2009 8:51 AM
To: Unicode
Subject: new version of BCP 47: language identifiers

The newest version of BCP 47 for language identifiers has just been approved, after a 3-year slog! I don't know how long it will be until it is published, which will involve:

* the spec at http://www.rfc-editor.org/rfc/bcp/bcp47.txt being updated to http://tools.ietf.org/html/draft-ietf-ltru-4646bis-23 (plus editing below), and
* the registry at http://www.iana.org/assignments/language-subtag-registry being updated to http://tools.ietf.org/html/draft-ietf-ltru-4645bis

But people can start the ball rolling on various upgrades where needed. There is a simple utility at http://unicode.org/cldr/utility/languageid.jsp for going from language identifiers (language tags) to their components. We'll also be updating the next version of CLDR (see the draft at http://cldr.unicode.org/development/design-proposals/languages-to-show-for-translation).
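
To illustrate what "going from language identifiers to their components" means, here is a minimal Python sketch. It is not the code behind the utility above, and the function name split_language_tag is made up for this example. It assumes a simplified grammar (language, optional script, optional region, optional variants) and deliberately ignores extlang, extension, private-use, and grandfathered tags.

```python
import re

# Simplified pattern (an assumption for illustration, not the full BCP 47 ABNF):
# language [-script] [-region] [-variant]*
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_language_tag(tag):
    """Split a simple language tag into components, or return None if it
    does not fit the simplified pattern above."""
    m = _TAG.match(tag)
    if m is None:
        return None
    script = m.group("script")
    region = m.group("region")
    return {
        # Conventional casing: language lower, script title, region upper.
        "language": m.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": [v.lower() for v in m.group("variants").split("-") if v],
    }
```

For example, split_language_tag("zh-Hant-TW") yields language "zh", script "Hant", and region "TW", while "sl-rozaj-biske" yields language "sl" with two variants. Checking the subtags against the registry is a separate step that this sketch does not attempt.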

Mark

On Thu, Jun 25, 2009 at 07:16, The IESG <iesg-secretary@ietf.org> wrote:
The IESG has approved the following document:

- 'Tags for Identifying Languages '
  <draft-ietf-ltru-4646bis-23.txt> as a BCP

This document is the product of the Language Tag Registry Update Working
Group.

The IESG contact persons are Alexey Melnikov and Lisa Dusseault.

A URL of this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-ltru-4646bis-23.txt

Technical Summary

This document describes the structure, content, construction, and
semantics of language tags for use in cases where it is desirable to
indicate the language used in an information object. It also
describes how to register values for use in language tags and the
creation of user-defined extensions for private interchange.
This document is an update of RFC4646. The main change is the
addition of thousands of three-letter language subtags for languages
for which tagging was not possible up to now. Also, the registry
format and procedures were adjusted to deal with this change,
and to reflect experience from current practice.

Working Group Summary

The WG process for this document was mostly smooth and revolving
around details. There were some highly contentious issues, but
for all of them, a solution was found that was acceptable to
the involved parties and works for all scenarios identified.

Document Quality

The IANA Language Subtag Registry, and the language tags that can
be formed according to this document and its predecessor, are widely
used across the Internet to identify languages, both in implementations
(code) and in a wide range of data.

Personnel

Martin J. Dürst is the document shepherd. Alexey Melnikov
is the responsible AD.

RFC Editor Note

Please move the reference to RFC 2028 to the Informative section.

The document has several references to BCP 47. RFC Editor
should check if they are appropriate and how to represent them better.

There are several cases of mismatched singulars and plurals
in the document, so RFC Editor might want to check for these.

Please replace the last paragraph of section 6 with 2 paragraphs:
OLD:
  The registries specified in this document are not suitable for
  frequent or real-time access to, or retrieval, of the full registry
                                               ^
  contents. Most applications do not need registry data at all. For
  others, being able to validate or canonicalize language tags as of a
  particular registry date will be sufficient, as the registry contents
  change only occasionally. Changes are announced to
  <ietf-languages-announcements@iana.org>. Changes, or the absence
  thereof, can also easily be detected by looking at the 'File-Date'
  record at the start of the registry, or by using features of the
  protocol used for downloading, without having to download the full
  registry.

NEW:
  The registries specified in this document are not suitable for
  frequent or real-time access to, or retrieval of, the full registry
                                               ^ ^
  contents. Most applications do not need registry data at all. For
  others, being able to validate or canonicalize language tags as of a
  particular registry date will be sufficient, as the registry contents
  change only occasionally. Changes are announced to
  <ietf-languages-announcements@iana.org>. This mailing list is
                                           ^^^^^^^^^^^^^^^^^^^^
  intended for interested organizations and individuals, not for bulk
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  subscription to trigger automatic software updates. The size of the
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  registry makes it unsuitable for automatic software updates.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  Implementers considering integrating the Language Subtag Registry in
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  an automatic updating scheme are strongly advised to distribute only
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  suitably encoded differences, and only via their own infrastructure,
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  not directly from IANA.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  Changes, or the absence thereof, can also easily be detected by
  looking at the 'File-Date' record at the start of the registry, or
  by using features of the protocol used for downloading, without
  having to download the full registry. At the time of publication of
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  this document IANA is making the Language Tag registry available
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  over HTTP 1.1. The proper way to update a local copy of the Language
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  Subtag Registry using HTTP 1.1 is to use a conditional GET [RFC2616].
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
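
The two change-detection approaches named in the NEW text above (reading the 'File-Date' record, and using an HTTP 1.1 conditional GET) can be sketched in Python roughly as follows. This is an illustration only, not part of the draft. The function names are invented for this example, the use of an If-Modified-Since header is one assumed way of making the GET conditional, and the "%%" record separator reflects the registry's record-jar format.

```python
import urllib.error
import urllib.request
from email.utils import formatdate

# Registry URL as given earlier in this message.
REGISTRY_URL = "http://www.iana.org/assignments/language-subtag-registry"


def build_conditional_request(url, last_fetch_epoch):
    """Build a GET request that asks for the body only if the resource
    changed after last_fetch_epoch (seconds since the Unix epoch)."""
    req = urllib.request.Request(url)
    req.add_header("If-Modified-Since", formatdate(last_fetch_epoch, usegmt=True))
    return req


def fetch_if_changed(url, last_fetch_epoch):
    """Return the registry text, or None if the server answers 304."""
    req = build_conditional_request(url, last_fetch_epoch)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: the local copy is still current
            return None
        raise


def read_file_date(registry_text):
    """Return the value of the 'File-Date' record at the start of the
    registry, or None if the first record does not contain one."""
    for line in registry_text.splitlines():
        if line.startswith("File-Date:"):
            return line.split(":", 1)[1].strip()
        if line.strip() == "%%":  # end of the first record
            break
    return None
```

In keeping with the advice in the NEW text, a check like this belongs in an implementer's own infrastructure on an occasional schedule, not in every deployed client.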

Please add RFC 2616 to the list of Informative references.

Please change Mark Davis's email address to markdavis@google.com.

Please insert a new section 3.9 that reads:

3.9. Applicability of the Subtag Registry

The Language Subtag Registry is the source of data elements used to
construct language tags, following rules described in this document.
Language tags are designed for indicating linguistic attributes of
various content, including not only text but also most media formats
such as video or audio. They also form the basis for language and
locale negotiation in various protocols and APIs.

The registry is therefore applicable to many applications that need some
form of language identification, with these limitations:

  - It is not designed to be the sole data source in the creation of a
    language selection user interface. For example, the registry does not
    contain translations for subtag descriptions or for tags composed from the
    subtags. Sources for localized data based on the registry are generally
    available, notably [CLDR]. Nor does the registry indicate which subtag
    combinations are particularly useful or relevant.

  - It does not provide information indicating relationships between
    different languages, such as might be used in a user interface to select
    language tags hierarchically, regionally, or on some other organizational
    model.

  - It does not supply information about potential overlap between
    different language tags, as the notion of what constitutes a language is
    not precise: several different language tags might be reasonable choices
    for the same given piece of content.

  - It does not contain information about appropriate fallback choices
    when performing language negotiation. A good fallback language might be
    linguistically unrelated to the specified language. The fact that one
    language is often used as a fallback language for another is usually a
    result of outside factors, such as geography, history, or culture--factors
    which might not apply in all cases. For example, most people who use
    Breton (a Celtic language used in the Northwest of France) would probably
    prefer to be served French (a Romance language) if Breton isn't available.
_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www.ietf.org/mailman/listinfo/ltru

Mark

This archive was generated by hypermail 2.1.5 : Thu Jun 25 2009 - 11:47:24 CDT