Unicode CLDR Charts
The Unicode CLDR Charts provide different ways to view the
Common Locale Data Repository data. There are currently the following
types of charts:
Survey Tool Charts
provide a view of the CLDR data designed for collecting and vetting
data during certain periods. Most proposed data (new or corrections) is
entered via this tool. For the schedule for the next release, see CLDR Project.
Supplemental Data Charts
show the supplemental data: data that is not part of the locale hierarchy but is still part of CLDR.
Summary Charts
provide a summary view of the main locale data. Language locales (those
with no territory or variant) are presented with fully resolved data;
the inherited or aliased data can be hidden if desired. Other locales
do not show inherited or aliased data, just the differences from the
respective language locale. The English value is provided for
comparison (shown as "=" if it is equal to the localized value, and n/a if not available).
By-Type Charts
provide a side-by-side comparison of data from different locales for
each field. For example, one can see all the locales that are
left-to-right, or all the different translations of the Arabic script
across languages. Data that is unconfirmed or provisional is marked by
a red-italic locale ID, such as ·bn_BD·.
Transform Charts
show some of the transforms in CLDR: the transliterations between different scripts. For more on transliterations, see Transliteration Guidelines.
Platform Comparison Charts
provide comparisons between locale data from different sources. The
Collation platform comparison charts have a separate format.
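The Summary Charts rules described above (showing only the differences from the language locale, and rendering the English comparison value as "=" or n/a) can be sketched roughly as follows. This is an illustrative sketch only, not the chart generator's actual code: the function names, the dictionary-based data model, and the placeholder field names and values are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


def english_cell(localized: str, english: Optional[str]) -> str:
    """Render the English comparison cell for one field."""
    if english is None:
        return "n/a"  # no English value available
    return "=" if english == localized else english


def summary_rows(
    locale_data: Dict[str, str],
    language_data: Dict[str, str],
    english_data: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Keep only the fields whose value differs from the language locale.

    Assumes all three inputs are fully resolved field -> value mappings
    (a hypothetical data model, not the real CLDR file layout).
    """
    rows = []
    for field, value in locale_data.items():
        if language_data.get(field) == value:
            continue  # same as the language locale: treated as inherited, not shown
        rows.append((field, value, english_cell(value, english_data.get(field))))
    return rows


# Placeholder data, invented purely for illustration.
language = {"field1": "A", "field2": "B", "field3": "C"}
regional = {"field1": "A", "field2": "B2", "field3": "C3", "field4": "D"}
english = {"field1": "A", "field2": "B2", "field3": "X"}

rows = summary_rows(regional, language, english)
```

With this placeholder data, `field1` is dropped because it matches the language locale, and the remaining three rows show the three possible forms of the English cell.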
Note that some CLDR data is not yet included in the
charts. For example, text segmentation is not yet included, and
collation data only appears in the platform comparison charts.
For information on bug reports on the data, see CLDR Bug Reports.
The format of most of the fields in the charts will be clear
from the Name and ID, such as the months of the year. The format for
others, such as the date or time formats, is structured and requires
more interpretation. For more information, see UTS #35: Locale Data Markup Language (LDML)
and CLDR Bug Reports.
In the platform comparison charts (main,
collation),
the files are organized by locale. In each file, the first three
columns identify the data item, while the subsequent columns contain
data. There may be different numbers of columns per locale, based on
the available comparison data. The Common data is in the first data
column, with other data sources following (where available). The data
for those other sources is generated with public APIs.
Columns
The links within the top header cells point to the XML data
files, either for this locale or for a related locale (the parent or
root). Each column is marked with a color at the top. The successive
rows provide a comparison of the data, with the following color coding:
- Missing data is indicated by a white cell.
- Data that is the same as a column to the left uses that column's color (with an equals sign in the cell).
- Data that is identical except for a case change is indicated with a dagger (†).
- Data that differs from every column to the left has its own column's normal color.
Examples from different locales:
| N. | ParentNode | Name | ID | COMMON (nl_NL, nl, root) | WINDOWS (nl_NL) | IBMJDK (nl_NL, nl, root) | SOLARIS (nl_NL) | OPEN_OFFICE (nl_NL) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ... |
| 56 | dayNames | day | sat | zaterdag | = | = | = | = |
| ... |
| 5 | dateFormat | pattern | full | yyyy 'm.' MMMM d 'd.',EEEE | yyyy 'm.' MMMM d 'd.' | = | yyyy MMMM dd HH:mm:ss | |
| ... |
| 120 | territories | territory | ZA | Suid-Afrika | Suid Afrika | | | South Africa |
| ... |
| 79 | languages | language | be | Беларускі | беларускі† | беларускі† | ... |
| ... |
| 35 | currency | displayName | EUR | ユーロ | ... |
| ... |
| 2 | characters | exemplarC | | [a-z ą ę į ų ė ū č š ž] | ... |
| ... |
| 723 | types | type | phonebook | Telefonbuch-Sortierregeln | ... |
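The cell-marking rules listed under Columns can be sketched as a small classifier over one row of data cells. This is a minimal sketch, not the code that generates the charts. The function name and the label strings are hypothetical, and two details are assumptions: columns to the left are scanned from left to right with the first match winning (inferred from the Belarusian example row, where both later columns carry a dagger), and "a case change" is approximated with Python's `str.casefold`.

```python
from typing import List, Optional, Sequence


def classify_row(cells: Sequence[Optional[str]]) -> List[str]:
    """Label each data cell in a row, reading the columns left to right.

    Labels (hypothetical names):
      "missing"  - no data (white cell)
      "same"     - identical to a column to the left (shown as "=")
      "case"     - identical to a column to the left except for case (dagger)
      "distinct" - differs from every column to the left (own column color)
    """
    labels = []
    for i, value in enumerate(cells):
        if not value:  # None or empty string
            labels.append("missing")
            continue
        label = "distinct"
        for left in cells[:i]:
            if not left:
                continue  # nothing to compare against in a missing cell
            if left == value:
                label = "same"
                break
            if left.casefold() == value.casefold():
                label = "case"
                break
        labels.append(label)
    return labels


# Rows taken from the example table above (data columns only).
day_row = classify_row(["zaterdag"] * 5)
territory_row = classify_row(["Suid-Afrika", "Suid Afrika", None, None, "South Africa"])
language_row = classify_row(["Беларускі", "беларускі", "беларускі"])
```

The first column is always "distinct" or "missing", since it has no columns to its left. In the territory row, "Suid Afrika" differs from "Suid-Afrika" by more than case, so it keeps its own column's color.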
The collation pages are kept separate for easier viewing. There
are currently only three comparison columns for collation.
The XML link points to the data file, while the base link
points to the collation base. The base for the Common collation rules
is the UCA. For the other data sources, the base is chosen to be an
ordering for one of the locales sharing the same script. That allows
the rules to contain only the differences from the base.
- The platform-specific culture information used for comparison
was generated programmatically by querying the native APIs on
each platform. The results were then converted to conform to the "Locale Data Markup Language Specification".
- Platform-specific locale information is grouped into one folder per platform. For example, the windows/ folder contains all the locale data
for the Windows platform in XML format.
- The locale information from these various sources is used to
create the comparison charts, which show where the sources differ
for each data element in a locale.
- The data for all but
common is provided for comparison only, and is not to be viewed as authoritative
or referenceable.
- For more details on the locale data collection process, see the CLDR process.
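As a rough illustration of how converted platform data could be compared against the Common data, the sketch below reads territory display names from two small LDML fragments and reports the entries that differ. It is not the tooling used to build the charts. The inline XML fragments stand in for files that would live in the per-platform folders, the function names are hypothetical, and only the `localeDisplayNames/territories/territory` element path from LDML is relied on.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

# Inline stand-ins for a Common file and a converted platform file (assumed content,
# using the ZA values from the example table).
COMMON_XML = """<ldml>
  <localeDisplayNames>
    <territories>
      <territory type="ZA">Suid-Afrika</territory>
    </territories>
  </localeDisplayNames>
</ldml>"""

PLATFORM_XML = """<ldml>
  <localeDisplayNames>
    <territories>
      <territory type="ZA">Suid Afrika</territory>
    </territories>
  </localeDisplayNames>
</ldml>"""


def territory_names(ldml_text: str) -> Dict[str, str]:
    """Map territory code -> display name from an LDML document string."""
    root = ET.fromstring(ldml_text)
    names = {}
    for elem in root.findall("localeDisplayNames/territories/territory"):
        code = elem.get("type")
        if code is not None and elem.text:
            names[code] = elem.text
    return names


def differing(
    common: Dict[str, str], platform: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the codes whose values differ, including codes missing on one side."""
    codes = sorted(set(common) | set(platform))
    return {
        code: (common.get(code), platform.get(code))
        for code in codes
        if common.get(code) != platform.get(code)
    }


diff = differing(territory_names(COMMON_XML), territory_names(PLATFORM_XML))
```

Here `diff` holds one entry for `ZA`, pairing the Common value with the platform value. A value of `None` on either side would correspond to a missing (white) cell in the charts.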