Re: CLDR and ICU

From: Mark Davis ☕
Date: Wed, 25 Jul 2012 17:55:21 -0700

Mark <>
*— Il meglio è l’inimico del bene —*

On Wed, Jul 25, 2012 at 5:01 PM, Richard Wordingham wrote:

> What is the formal relationship between the Common Locale Data
> Repository (CLDR) and International Components for Unicode (ICU)?

ICU is one of the main clients for CLDR data. Because it makes extensive
use of the data, the CLDR group also uses ICU for testing.

> I ask for two reasons:
> I raised a ticket on a
> proposed clarificatory addition to UTS#35 'Locale Data Markup
> Language', and it has just been closed as a duplicate of an ICU issue.
> As no-one disputes that the problem is an issue relating to LDML, this
> seems bizarre.

It was not closed as "a duplicate of an ICU issue". It was closed as a
"duplicate". You jumped to the conclusion that it was a duplicate of an ICU
issue.

The reason it was marked as a duplicate is that there had been changes in
the working draft such that the committee believed that the problems cited
in your report had been taken care of. For example, your ticket complains
about "[0.0.c.t]", but if you look at the working draft (be sure to refresh
your browser; sometimes an old version can hang around for a while), there
is no such text.

If there are still issues that you feel have not been resolved, the ticket
can be reopened with specific comments as to what was not addressed, or you
can open a new ticket for just the remaining items.

> The ICU implementation of collation tailoring for changed ordering is
> bizarre in some complicated cases. (Life can be complicated.) Should
> UTS#35 be documenting what ICU does, or should Unicode be saying what
> ICU should do when implementing a tailoring expressed in LDML?

This is a false dichotomy.

The goal for collation is to balance user expectations in terms of
functionality, feasibility, performance, and size. The CLDR committee
certainly takes into account how implementations can use CLDR data; it
would be of little good to have data that required implementations to be
overly bulky or complicated or slow. There will, however, always be room
for improvement.

In many cases there is a change in LDML or CLDR data where ICU and other
clients have to catch up to it; in many cases implementation experience in
ICU (or Windows, or iOS, or...) leads to a proposal for how to handle
something in LDML or CLDR data. In some cases ICU or other clients may have
their own tailorings on top of CLDR; and for that matter, many companies
(such as my company, Google) apply some patches on top of CLDR data.

The same is true for many other Unicode standards and data. The
implementations inform the standard, and are also adapting to changes in it.

> Richard.
Received on Wed Jul 25 2012 - 19:57:13 CDT