Minimal Locale Requirements
We need a mechanism for indicating that the most important locale data in an ldml element is at a satisfactory level, and some criteria for determining that. Only items
#1 and #2, plus yesExpr/noExpr, are for the LDML/CLDR 1.1 release; the rest could follow later.
1. Right now we can indicate that individual pieces of data are draft, but not the status of the whole. We propose adding to the DTD the following:
<!ATTLIST ldml draft ( true | false ) #IMPLIED >
2. In the normal course of events, we would expect data for new locales to be draft (and perhaps incomplete) at first; once the data is sufficiently vetted, the draft=true can be
taken off the top level.
We should generate a template that has every possible item in it, with every item marked draft and aliasing to root. Localizers can then take and modify this, removing the
aliases once the data has been validated, either replacing them with real data or letting the data inherit from the parent.
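
As an illustration of items 1 and 2, here is a minimal Python sketch of such a template and of the draft check. The item list is abbreviated and hypothetical, the form of the alias element (an alias child with source="root") is an assumption, and treating a missing draft attribute as "not draft" is also an assumption, since the attribute is #IMPLIED.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated item list; a real template would enumerate every item.
TEMPLATE_ITEMS = ["localeDisplayNames", "dates", "numbers"]


def make_template(language):
    """Build an <ldml> skeleton whose items are all draft and aliased to root."""
    ldml = ET.Element("ldml", {"draft": "true"})
    identity = ET.SubElement(ldml, "identity")
    ET.SubElement(identity, "language", {"type": language})
    for name in TEMPLATE_ITEMS:
        item = ET.SubElement(ldml, name, {"draft": "true"})
        # Assumed alias form: an <alias> child pointing at the root locale.
        ET.SubElement(item, "alias", {"source": "root"})
    return ldml


def is_draft(element):
    """Assumption: a missing draft attribute means the element is not draft."""
    return element.get("draft", "false") == "true"


def draft_items(ldml):
    """Names of the direct children of <ldml> that are still marked draft."""
    return [child.tag for child in ldml if is_draft(child)]


# A localizer validates the numbers data: drop the alias and the draft mark.
template = make_template("xx")
numbers = template.find("numbers")
numbers.remove(numbers.find("alias"))
del numbers.attrib["draft"]

print(is_draft(template))     # True: the top level is still draft
print(draft_items(template))  # ['localeDisplayNames', 'dates']
```

The top-level draft="true" stays in place until the remaining items have been vetted.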
3. The following is a proposed list of minimal data that we would expect to have in non-draft form for a locale X before we took draft off of <ldml...>. Of course, we
would encourage more; this is just the minimum level. The goal is to roll this into http://www.unicode.org/cldr/data_formats.html.
- identity
- localeDisplayNames
  - languages: localized names for X + English, German, French, Italian, Portuguese, Spanish, Russian, Chinese, Japanese, Korean
  - scripts: localized names for none, unless the language for X is customarily written in more than one script, in which case, those script names.
  - territories: localized names for G6 + BRIC (United States, United Kingdom, Germany, France, Italy, Japan; China, India, Russia, Brazil), plus X if X is a territory locale
  - variants, keys, types: localized names for those in use in the locale; e.g. the translation for PHONEBOOK in a German locale.
- layout, orientation
- exemplarCharacters
- measurementSystem, paperSize
- dates
  - All of the following for Gregorian, plus whatever is needed for another calendar if another calendar is in common use in X.
  - calendars: localized names for none, unless more than one calendar is in common use
  - monthNames & dayNames
    - context=format + width=narrow, wide, & abbreviated
    - plus context=stand-alone + width=narrow, wide, & abbreviated, if required in X
  - week: minDays, firstDay, weekendStart, weekendEnd (only req. for territory locales)
  - am/pm/eraNames/eraAbbr
  - dateFormat, timeFormat: full, long, medium, short
  - timeZoneNames: localized names for "GMT", plus, if the country has multiple time zones as defined by Olson, those time zone names. For each, that includes:
    - generic (long, short), standard (long, short), daylight (long, short)
    - exemplarCity
- numbers: symbols, decimalFormats, scientificFormats, percentFormats, currencyFormats
- currencies: localized names for G6 currencies (USD, JPY, EUR, GBP), plus the currency for X if X is a territory locale
- collation
- yesExpr, noExpr
Notes:
- The data may be absent (e.g. inherited from the parent) if such inheritance would result in correct data.
- Even if X is a territory locale, most of the data will go into the language locale that X is a descendant of. Thus if X were Spanish Guyana, then the localized name for X and
X's currency would go into the Spanish locale (unless they differ from 'standard' Spanish).
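
The list in item 3 could be checked mechanically. The following Python sketch is only an illustration: the required paths are a hypothetical, abbreviated subset of the list above, the element nesting is assumed, and the input is assumed to be the fully resolved data for X (after inheritance), so that an item legitimately inherited from the parent is not reported as missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated subset of the minimal list; the nesting is assumed.
REQUIRED_PATHS = [
    "identity",
    "localeDisplayNames/languages",
    "numbers/symbols",
    "dates",
]


def find_unmet(ldml, required=REQUIRED_PATHS):
    """Map each required path that is missing or still draft to its status."""
    problems = {}
    for path in required:
        node = ldml
        status = None
        for step in path.split("/"):
            node = node.find(step)
            if node is None:
                status = "missing"
                break
            if node.get("draft") == "true":
                status = "draft"
                break
        # An item also counts as draft if anything inside it is still draft.
        if status is None and any(d.get("draft") == "true" for d in node.iter()):
            status = "draft"
        if status is not None:
            problems[path] = status
    return problems


def can_remove_top_level_draft(ldml):
    """True once every required path is present in non-draft form."""
    return not find_unmet(ldml)


sample = ET.fromstring(
    '<ldml draft="true">'
    '<identity><language type="xx"/></identity>'
    "<localeDisplayNames><languages>"
    '<language type="en" draft="true">English</language>'
    "</languages></localeDisplayNames>"
    "<numbers><symbols><decimal>.</decimal></symbols></numbers>"
    "</ldml>"
)

print(find_unmet(sample))
# {'localeDisplayNames/languages': 'draft', 'dates': 'missing'}
print(can_remove_top_level_draft(sample))  # False
```

A real check would also need to verify the content of each item (for example, that the listed language and territory names are all present), which this sketch does not attempt.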