From: Mark Davis (firstname.lastname@example.org)
Date: Sat Jan 26 2008 - 13:44:58 CST
1. Your remarks should go to the cldr-users group (you could also cc
them here if you want, but they should definitely go there).
2. If you are referring to a CLDR bug, give the number. When I search
for "transliteration" I only see 5 open bugs, none of which appears to be
what you are talking about.
On Jan 19, 2008 3:48 PM, Philippe Verdy <email@example.com> wrote:
> > From: firstname.lastname@example.org [mailto:email@example.com] On
> > behalf of Rick McGowan
> > Sent: Saturday, January 19, 2008 17:58
> > To: firstname.lastname@example.org
> > Subject: Unicode Transliteration Guidelines released
> > The Unicode CLDR committee has released
> > "Unicode Transliteration Guidelines":
> > http://www.unicode.org/cldr/transliteration_guidelines.html
> Note the following text:
> Even within particular languages, there can be variants according to
> different authorities, or even varying across time (if the authority
> changes its recommendation). The canonical identifier that CLDR uses
> for these has the form:
> The source (and target) can be a language or script, either using the
> English name or a locale code. The variant should specify the
> authority, and if necessary, the year. For example, the identifier for
> the Russian to Latin transliteration according to the UNGEGN would be
> ru-und_Latn/UNGEGN, or
> A CLDR bug about the format of this identifier has been open for quite a
> long time, with proposed changes and comments suggesting that the use of
> '-' and '_' is inconsistent with existing practice for locale
> identifiers, where the two are treated equivalently.
> Also the placement of the variant is ambiguous if the transliteration is
> This bug was accepted by a CLDR committee member but deferred for later
> resolution. Apparently it still has this status and has been forgotten.
> I have recently proposed a solution using another format, based on pure
> locale ids (because transliteration variants are effectively creating
> variants by defining an alternate orthography for the associated
> And dropping support for languages identified by full names like:
> (because most of these names are not part of the CLDR Root locale and
> English names for languages are often ambiguous or could create havoc with
> some language names that must include the separators needed for parsing)
> The format should then become simply:
> where both locale ids adhere to the RFC definition.
> (Note that I suggest treating "." and "/" equivalently for the separator
> between the two locales, like we should treat "_" and "-" equivalently as
> tag separators within the locale id; this makes the format compatible with
> existing locale id parsers, resource bundle parsers or resolvers where "/"
> could cause problems with filesystems).
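
The separator equivalences suggested in the note above can be sketched in
Python. This is only an illustration under stated assumptions: the function
names are hypothetical, the identifier is assumed to be two locale ids in
source-then-target order joined by a single "." or "/", and the locale ids
themselves are not validated against any RFC grammar.

```python
import re
from typing import Tuple


def normalize_locale_id(locale_id: str) -> str:
    """Treat '_' and '-' as the same subtag separator, emitting '-'."""
    return locale_id.replace("_", "-")


def split_transform_id(identifier: str) -> Tuple[str, str]:
    """Split an identifier into its two locale ids.

    Assumption: the identifier is '<locale>.<locale>' or '<locale>/<locale>',
    with '.' and '/' treated as equivalent separators.
    """
    parts = re.split(r"[./]", identifier)
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected two locale ids in {identifier!r}")
    return normalize_locale_id(parts[0]), normalize_locale_id(parts[1])


# Both spellings yield the same pair of locale ids.
print(split_transform_id("ru/und_Latn"))
print(split_transform_id("ru.und-Latn"))
```

Normalizing to one separator before comparison means that spellings differing
only in "." versus "/", or "_" versus "-", resolve to the same pair, and the
"." form avoids a path separator when the identifier is used as a file or
resource name.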