Unicode CLDR Charts

The Unicode CLDR Charts provide different ways to view the data in the Common Locale Data Repository (CLDR). The following types of charts are currently available:

Survey Tool Charts provide a view of the CLDR data designed for collecting and vetting data during certain periods. Most proposed data (new or corrections) is entered via this tool. For the schedule for the next release, see CLDR Project.

Supplemental Data Charts show the supplemental data, that is, data that is not part of the locale hierarchy but is still part of CLDR.

Summary Charts provide a summary view of the main locale data. Language locales (those with no territory or variant) are presented with fully resolved data; the inherited or aliased data can be hidden if desired. Other locales do not show inherited or aliased data, just the differences from the respective language locale. The English value is provided for comparison (shown as "=" if it is equal to the localized value, and n/a if not available).
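The summary rule for non-language locales can be sketched in a few lines. This is only an illustration of the behavior described above, not the chart generator's code: the function name `summary_rows`, the flat key names, and the sample values are all hypothetical.

```python
def summary_rows(locale_data, parent_data, english_data):
    """Return (key, value, english) rows for a non-language locale.

    Keys whose value equals the parent (language) locale's value are
    treated as inherited and left out. The English column is "=" when
    English equals the localized value and "n/a" when English has no
    value for the key.
    """
    rows = []
    for key in sorted(locale_data):
        value = locale_data[key]
        if parent_data.get(key) == value:
            continue  # inherited from the language locale: not shown
        if key not in english_data:
            english = "n/a"
        elif english_data[key] == value:
            english = "="
        else:
            english = english_data[key]
        rows.append((key, value, english))
    return rows


# Hypothetical sample data for illustration; not taken from CLDR.
nl = {"territory.ZA": "Zuid-Afrika", "day.sat": "zaterdag"}
nl_BE = {"territory.ZA": "Zuid-Afrika", "day.sat": "zaterdag", "sample.key": "x"}
en = {"territory.ZA": "South Africa", "day.sat": "Saturday", "sample.key": "x"}

print(summary_rows(nl_BE, nl, en))
```

Only `sample.key` survives for `nl_BE` here, because the other two values match the parent `nl`.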

By-Type Charts provide a side-by-side comparison of data from different locales for each field. For example, one can see all the locales that are left-to-right, or all the different translations of the Arabic script across languages. Data that is unconfirmed or provisional is marked by a red-italic locale ID, such as ·bn_BD·.

Transform Charts show some of the transforms in CLDR: the transliterations between different scripts. For more on transliterations, see Transliteration Guidelines.

Platform Comparison Charts provide comparisons between locale data from different sources. The Collation platform comparison charts have a separate format.

Note that some CLDR data is not yet included in the charts. For example, text segmentation is not yet included, and collation data only appears in the platform comparison charts.

For information on bug reports on the data, see CLDR Bug Reports.

The format of most of the fields in the charts will be clear from the Name and ID, such as the months of the year. The format for others, such as the date or time formats, is structured and requires more interpretation. For more information, see UTS #35: Locale Data Markup Language (LDML) and CLDR Bug Reports.
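As an illustration of why date and time formats need interpretation, the sketch below splits a pattern into fields and literal text. It assumes the usual LDML conventions: a run of the same ASCII letter is a field, text inside single quotes is literal, and two adjacent single quotes stand for one literal quote. The function name `tokenize_pattern` is hypothetical, and UTS #35 remains the authority on what each field means.

```python
def tokenize_pattern(pattern):
    """Split a date/time pattern into ("field", ...) and ("literal", ...) tokens."""
    tokens = []
    literal = []          # characters of the literal currently being collected
    in_quote = False
    i = 0
    n = len(pattern)

    def flush():
        # Emit the pending literal text, if any.
        if literal:
            tokens.append(("literal", "".join(literal)))
            literal.clear()

    while i < n:
        ch = pattern[i]
        if ch == "'":
            if i + 1 < n and pattern[i + 1] == "'":
                literal.append("'")   # doubled quote: one literal quote
                i += 2
            else:
                in_quote = not in_quote
                i += 1
        elif not in_quote and ch.isascii() and ch.isalpha():
            flush()
            j = i
            while j < n and pattern[j] == ch:
                j += 1                # consume the run of identical letters
            tokens.append(("field", pattern[i:j]))
            i = j
        else:
            literal.append(ch)        # quoted text, punctuation, or spaces
            i += 1
    flush()
    return tokens


for token in tokenize_pattern("yyyy 'm.' MMMM d 'd.',EEEE"):
    print(token)
```

In that pattern, the quoted `'m.'` and `'d.'` come out as literal text, while the unquoted `d` is a day-of-month field.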

Platform Comparison Charts

In the platform comparison charts (main, collation), the files are organized by locale. In each file, the first three columns identify the data item, while the subsequent columns contain data. The number of columns may differ from locale to locale, depending on the available comparison data. The Common data is in the first data column, followed by the other data sources (where available). Data for these other sources is generated through public APIs.

Columns

The links within the top header cells point to the XML data files, either for this locale or for a related locale (the parent or root). Each column is marked with a color at the top. The successive rows provide a comparison of the data, with the following color coding:

  • Missing data is indicated by a white cell.
  • Data that is the same as some column to the left uses the same color (with an equals sign in the cell).
  • Data that is identical except for a case change is indicated with a dagger (†).
  • Data that differs from every column to the left has the column's normal color.
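The color-coding rules above can be sketched as a per-row classification. This is a minimal illustration, not the chart generator's code: the function name `classify_row` and the label strings are hypothetical. The rules do not say what happens when a cell matches one earlier column exactly and another only by case, so checking exact matches first is an assumption.

```python
def classify_row(cells):
    """Label each cell of one chart row, in column order.

    cells: list of strings, with None (or "") for missing data.
    Labels: "missing"   -> white cell
            "same"      -> equals some column to the left ("=")
            "case-only" -> equals some column to the left except for case (dagger)
            "distinct"  -> differs from every column to the left (column's own color)
    """
    labels = []
    for i, cell in enumerate(cells):
        if not cell:
            labels.append("missing")
            continue
        earlier = [c for c in cells[:i] if c]   # non-missing cells to the left
        if cell in earlier:
            labels.append("same")               # assumption: exact match wins
        elif any(cell.casefold() == c.casefold() for c in earlier):
            labels.append("case-only")
        else:
            labels.append("distinct")
    return labels


print(classify_row(["zaterdag", "zaterdag", "zaterdag"]))
print(classify_row(["Suid-Afrika", "Suid Afrika", None, "South Africa"]))
print(classify_row(["Беларускі", "беларускі"]))
```

The first column is always either missing or distinct, since there is nothing to its left to compare against.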

Examples from different locales:

N. ParentNode Name ID COMMON (nl_NL, nl, root) WINDOWS (nl_NL) IBMJDK (nl_NL, nl, root) SOLARIS (nl_NL) OPEN_OFFICE (nl_NL)
...
56 dayNames day sat zaterdag = = = =
...
5 dateFormat pattern full yyyy 'm.' MMMM d 'd.',EEEE yyyy 'm.' MMMM d 'd.' = yyyy MMMM dd HH:mm:ss
...
120 territories territory ZA Suid-Afrika Suid Afrika South Africa
...
79 languages language be Беларускі беларускі† беларускі† ...
...
35 currency displayName EUR ユーロ ...
...
2 characters exemplarC [a-z ą ę į ų ė ū č š ž] ...
...
723 types type phonebook Telefonbuch-Sortierregeln ...

Collation

The collation pages are kept separate for easier viewing. There are currently only three comparison columns for collation.

COMMON (xml UCA) LINUX (xml Base (en_US)) WINDOWS (xml base (en))

The XML link points to the data file, while the base link points to the collation base. The base for the Common collation rules is the UCA. For the other data sources, the base is chosen to be an ordering for one of the locales sharing the same script. This allows the rules to contain only the differences from the base.

Data Collection Process

  • The platform-specific culture information used for comparison was generated programmatically by querying the native APIs on each platform. The results were then converted to comply with the "Locale Data Markup Language Specification".
  • Platform-specific locale information is grouped into a folder per platform. For example, the windows/ folder contains all the locale data for the Windows platform in XML format.
  • The locale information from these sources is used to create the comparison charts, which show the differences between sources for each data element in a locale.
  • The data for all but common is provided for comparison only, and is not to be viewed as authoritative or referenceable.
  • For more details on the locale data collection process, see the CLDR process.
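The conversion step in the first bullet can be pictured with a small sketch that takes day names already obtained from a platform and writes them as LDML-style XML. Everything here is an assumption made for illustration: the function name `to_ldml` is hypothetical, the input dictionary stands in for whatever a native API would return, and the element nesting follows the general shape of LDML date data, which should be checked against the specification version in use.

```python
import xml.etree.ElementTree as ET


def to_ldml(language, territory, day_names):
    """Build an LDML-style XML string from collected day names.

    day_names maps day IDs such as "sat" to localized names. The element
    layout below is an assumed approximation of LDML, not a validated one.
    """
    ldml = ET.Element("ldml")
    identity = ET.SubElement(ldml, "identity")
    ET.SubElement(identity, "language", type=language)
    if territory:
        ET.SubElement(identity, "territory", type=territory)

    dates = ET.SubElement(ldml, "dates")
    calendars = ET.SubElement(dates, "calendars")
    calendar = ET.SubElement(calendars, "calendar", type="gregorian")
    days = ET.SubElement(calendar, "days")
    context = ET.SubElement(days, "dayContext", type="format")
    width = ET.SubElement(context, "dayWidth", type="wide")
    for day_id, name in day_names.items():
        day = ET.SubElement(width, "day", type=day_id)
        day.text = name

    return ET.tostring(ldml, encoding="unicode")


# Stand-in for values that would come from a platform API.
xml_text = to_ldml("nl", "NL", {"sat": "zaterdag"})
print(xml_text)
```

Writing each platform's output in one common XML shape is what makes the side-by-side comparison in the charts possible.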
