new version of BCP 47: language identifiers

From: Mark Davis ⌛ (mark@macchiato.com)
Date: Thu Jun 25 2009 - 10:51:04 CDT

    The newest version of BCP 47 for language identifiers has just been
    approved, after a 3-year slog! I don't know how long it will be until it is
    published, which will involve:

       - the spec at http://www.rfc-editor.org/rfc/bcp/bcp47.txt being updated
       to http://tools.ietf.org/html/draft-ietf-ltru-4646bis-23 (plus the edits
       quoted below), and
       - the registry at
       http://www.iana.org/assignments/language-subtag-registry being updated to
       http://tools.ietf.org/html/draft-ietf-ltru-4645bis

    But people can start the ball rolling on various upgrades where needed.
    There is a simple utility at
    http://unicode.org/cldr/utility/languageid.jsp for going from language
    identifiers (language tags) to their components.
    We'll also be updating the next version of CLDR (see the draft at
    http://cldr.unicode.org/development/design-proposals/languages-to-show-for-translation
    ).
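
    For anyone who wants a feel for what "going from language tags to their
    components" means, here is a rough sketch in Python. It is not the code
    behind the utility mentioned above, and the names LanguageTag and
    split_language_tag are made up for illustration. It only covers the common
    shape language[-extlang][-script][-region][-variant...], leaves extension
    and private-use subtags unparsed, ignores grandfathered tags, and does not
    check subtags against the registry.

```python
import re
from typing import List, NamedTuple, Optional


class LanguageTag(NamedTuple):
    """Components of a language tag (hypothetical helper type)."""
    language: str
    extlangs: List[str]
    script: Optional[str]
    region: Optional[str]
    variants: List[str]
    rest: List[str]  # extension / private-use subtags, left unparsed


# Subtag shapes, simplified from the language tag grammar.
_LANGUAGE = re.compile(r"[A-Za-z]{2,8}")
_EXTLANG = re.compile(r"[A-Za-z]{3}")
_SCRIPT = re.compile(r"[A-Za-z]{4}")
_REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")
_VARIANT = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def split_language_tag(tag: str) -> LanguageTag:
    """Split a well-formed tag into components; raise ValueError otherwise."""
    subtags = tag.split("-")
    if not _LANGUAGE.fullmatch(subtags[0]):
        raise ValueError(f"not a language subtag: {subtags[0]!r}")
    language = subtags[0].lower()
    i = 1

    # Up to three extended language subtags may follow the language.
    extlangs: List[str] = []
    while i < len(subtags) and len(extlangs) < 3 and _EXTLANG.fullmatch(subtags[i]):
        extlangs.append(subtags[i].lower())
        i += 1

    script = None
    if i < len(subtags) and _SCRIPT.fullmatch(subtags[i]):
        script = subtags[i].title()  # conventional casing, e.g. "Hant"
        i += 1

    region = None
    if i < len(subtags) and _REGION.fullmatch(subtags[i]):
        region = subtags[i].upper()  # conventional casing, e.g. "TW"
        i += 1

    variants: List[str] = []
    while i < len(subtags) and _VARIANT.fullmatch(subtags[i]):
        variants.append(subtags[i].lower())
        i += 1

    # Whatever remains must start with a singleton (extension or "x").
    rest = subtags[i:]
    if rest and len(rest[0]) != 1:
        raise ValueError(f"unexpected subtag: {rest[0]!r}")
    return LanguageTag(language, extlangs, script, region, variants, rest)
```

    For example, split_language_tag("sr-latn-rs") gives language "sr", script
    "Latn" and region "RS", and "sl-rozaj-biske" gives two variants.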

    Mark

    On Thu, Jun 25, 2009 at 07:16, The IESG <iesg-secretary@ietf.org> wrote:

    > The IESG has approved the following document:
    >
    > - 'Tags for Identifying Languages '
    > <draft-ietf-ltru-4646bis-23.txt> as a BCP
    >
    > This document is the product of the Language Tag Registry Update Working
    > Group.
    >
    > The IESG contact persons are Alexey Melnikov and Lisa Dusseault.
    >
    > A URL of this Internet-Draft is:
    > http://www.ietf.org/internet-drafts/draft-ietf-ltru-4646bis-23.txt
    >
    > Technical Summary
    >
    > This document describes the structure, content, construction, and
    > semantics of language tags for use in cases where it is desirable to
    > indicate the language used in an information object. It also
    > describes how to register values for use in language tags and the
    > creation of user-defined extensions for private interchange.
    > This document is an update of RFC4646. The main change is the
    > addition of thousands of three-letter language subtags for languages
    > for which tagging was not possible up to now. Also, the registry
    > format and procedures were adjusted to deal with this change,
    > and to reflect experience from current practice.
    >
    > Working Group Summary
    >
    > The WG process for this document was mostly smooth and revolving
    > around details. There were some highly contentious issues, but
    > for all of them, a solution was found that was acceptable to
    > the involved parties and works for all scenarios identified.
    >
    > Document Quality
    >
    > The IANA Language Subtag Registry, and the language tags that can
    > be formed according to this document and its predecessor, are widely
    > used across the Internet to identify languages, both in implementations
    > (code) and in a wide range of data.
    >
    > Personnel
    >
    > Martin J. Dürst is the document shepherd. Alexey Melnikov
    > is the responsible AD.
    >
    > RFC Editor Note
    >
    > Please move the reference to RFC 2028 to the Informative section.
    >
    > The document has several references to BCP 47. RFC Editor
    > should check if they are appropriate and how to represent them better.
    >
    > There are several cases of mismatched singulars and plurals
    > in the document, so RFC Editor might want to check for these.
    >
    > Please replace the last paragraph of section 6 with 2 paragraphs:
    > OLD:
    > The registries specified in this document are not suitable for
    > frequent or real-time access to, or retrieval, of the full registry
    >                                              ^
    > contents. Most applications do not need registry data at all. For
    > others, being able to validate or canonicalize language tags as of a
    > particular registry date will be sufficient, as the registry contents
    > change only occasionally. Changes are announced to
    > <ietf-languages-announcements@iana.org>. Changes, or the absence
    > thereof, can also easily be detected by looking at the 'File-Date'
    > record at the start of the registry, or by using features of the
    > protocol used for downloading, without having to download the full
    > registry.
    >
    > NEW:
    > The registries specified in this document are not suitable for
    > frequent or real-time access to, or retrieval of, the full registry
    >                                              ^  ^
    > contents. Most applications do not need registry data at all. For
    > others, being able to validate or canonicalize language tags as of a
    > particular registry date will be sufficient, as the registry contents
    > change only occasionally. Changes are announced to
    > <ietf-languages-announcements@iana.org>. This mailing list is
    >                                          ^^^^^^^^^^^^^^^^^^^^
    > intended for interested organizations and individuals, not for bulk
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > subscription to trigger automatic software updates. The size of the
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > registry makes it unsuitable for automatic software updates.
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > Implementers considering integrating the Language Subtag Registry in
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > an automatic updating scheme are strongly advised to distribute only
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > suitably encoded differences, and only via their own infrastructure,
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > not directly from IANA.
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > Changes, or the absence thereof, can also easily be detected by
    > looking at the 'File-Date' record at the start of the registry, or
    > by using features of the protocol used for downloading, without
    > having to download the full registry. At the time of publication of
    >                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > this document IANA is making the Language Tag registry available
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > over HTTP 1.1. The proper way to update a local copy of the Language
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    > Subtag Registry using HTTP 1.1 is to use a conditional GET [RFC2616].
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > Please add RFC 2616 to the list of Informative references.
    >
    > Please change Mark Davis's email address to markdavis@google.com.
    >
    > Please insert a new section 3.9 that reads:
    >
    > 3.9. Applicability of the Subtag Registry
    >
    > The Language Subtag Registry is the source of data elements used to
    > construct language tags, following rules described in this document.
    > Language tags are designed for indicating linguistic attributes of
    > various content, including not only text but also most media formats
    > such as video or audio. They also form the basis for language and
    > locale negotiation in various protocols and APIs.
    >
    > The registry is therefore applicable to many applications that need some
    > form of language identification, with these limitations:
    >
    > - It is not designed to be the sole data source in the creation of a
    > language selection user interface. For example, the registry does not
    > contain translations for subtag descriptions or for tags composed from the
    > subtags. Sources for localized data based on the registry are generally
    > available, notably [CLDR]. Nor does the registry indicate which subtag
    > combinations are particularly useful or relevant.
    >
    > - It does not provide information indicating relationships between
    > different languages, such as might be used in a user interface to select
    > language tags hierarchically, regionally, or on some other organizational
    > model.
    >
    > - It does not supply information about potential overlap between
    > different language tags, as the notion of what constitutes a language is
    > not precise: several different language tags might be reasonable choices
    > for the same given piece of content.
    >
    > - It does not contain information about appropriate fallback choices
    > when performing language negotiation. A good fallback language might be
    > linguistically unrelated to the specified language. The fact that one
    > language is often used as a fallback language for another is usually a
    > result of outside factors, such as geography, history, or culture--factors
    > which might not apply in all cases. For example, most people who use
    > Breton (a Celtic language used in the Northwest of France) would probably
    > prefer to be served French (a Romance language) if Breton isn't available.
    >
    >
    > _______________________________________________
    > Ltru mailing list
    > Ltru@ietf.org
    > https://www.ietf.org/mailman/listinfo/ltru
    >
    >
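
    The editor note quoted above recommends a conditional GET for refreshing a
    local copy of the registry, and mentions the 'File-Date' record at the
    start of the file. A minimal sketch of both, using only the Python
    standard library, might look like the following. The function names are
    made up for illustration, the registry URL is the one given earlier in
    this message, and the validator values (Last-Modified, ETag) are assumed
    to have been saved from an earlier response.

```python
import urllib.error
import urllib.request
from typing import Optional

# Registry location as given in this message.
REGISTRY_URL = "http://www.iana.org/assignments/language-subtag-registry"


def build_conditional_request(url: str,
                              last_modified: Optional[str] = None,
                              etag: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request carrying the validators saved from a previous fetch."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url: str,
                     last_modified: Optional[str] = None,
                     etag: Optional[str] = None) -> Optional[bytes]:
    """Return the new body, or None if the server answers 304 Not Modified."""
    request = build_conditional_request(url, last_modified, etag)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None
        raise


def file_date(registry_text: str) -> Optional[str]:
    """Read the File-Date record from the first non-empty line, if present."""
    for line in registry_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("File-Date:"):
            return line.split(":", 1)[1].strip()
        return None  # first record is something else
    return None
```

    A caller would keep the Last-Modified or ETag value from the last
    successful download and pass it to fetch_if_changed(REGISTRY_URL, ...);
    a None result means the local copy is still current. In line with the
    note's advice, this is meant for occasional refreshes of one's own copy,
    not for pointing deployed software directly at IANA.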



