Proposed CLDR 1.1 changes for Chinese sorting

The following are recommended changes to the Chinese collators.

Here is the list of collators currently in CLDR 1.0. The names differ slightly (see #1 below), but there are four distinct orderings for Chinese.

1. Add aliases so that:

2. Fix the Latin Pinyin ordering in all of the Chinese collators (including the new ones below) according to GB/T 13418-92, Section 5.1.1, Chinese Pinyin Collation Rules (http://www.moe.edu.cn/moe-dept/yuxin/content/gfbz/scanning/gfhbz/gfbz18.htm):

CLDR 1.0:

& ̄
  << ́
  << ̌
  << ̀
  << ̈

Proposed CLDR 1.1:

&[before 2] a << ā <<< Ā << á <<< Á << ǎ <<< Ǎ << à <<< À
&[before 2] e << ē <<< Ē << é <<< É << ě <<< Ě << è <<< È
&[before 2] i << ī <<< Ī << í <<< Í << ǐ <<< Ǐ << ì <<< Ì
&[before 2] o << ō <<< Ō << ó <<< Ó << ǒ <<< Ǒ << ò <<< Ò
&[before 2] u << ū <<< Ū << ú <<< Ú << ǔ <<< Ǔ << ù <<< Ù
& U << ǖ <<< Ǖ << ǘ <<< Ǘ << ǚ <<< Ǚ << ǜ <<< Ǜ << ü
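
As an illustration only, the ordering that the `&[before 2]` rules above appear to describe can be sketched as a three-level sort key in Python. This is not ICU code. It assumes the rules are read as: base letters differ at the primary level, tone marks differ at the secondary level in the order macron, acute, caron, grave with the unmarked vowel last, ü and its toned forms follow all forms of u, and case differs at the tertiary level. The name `pinyin_key` is hypothetical.

```python
import unicodedata

# Combining marks used for pinyin tones, ranked in tone order 1-4.
TONE_RANK = {
    "\u0304": 1,  # macron, first tone
    "\u0301": 2,  # acute, second tone
    "\u030C": 3,  # caron, third tone
    "\u0300": 4,  # grave, fourth tone
}
NEUTRAL = 5           # assumption: an unmarked vowel sorts after its toned forms
DIAERESIS = "\u0308"  # turns u into ü


def pinyin_key(syllable):
    """Return a (primary, secondary, tertiary) key for a Latin pinyin string."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFD", syllable):
        if secondary and ch in TONE_RANK:
            # Record the tone on the preceding base letter.
            has_diaeresis, _ = secondary[-1]
            secondary[-1] = (has_diaeresis, TONE_RANK[ch])
        elif secondary and ch == DIAERESIS:
            # ü forms sort after every form of plain u (secondary difference).
            _, tone = secondary[-1]
            secondary[-1] = (1, tone)
        else:
            primary.append(ch.lower())
            secondary.append((0, NEUTRAL))
            tertiary.append(ch.isupper())
    return (primary, secondary, tertiary)


vowels = sorted(["a", "à", "ǎ", "á", "ā"], key=pinyin_key)
u_forms = sorted(["ü", "ǜ", "u", "ū", "ǖ"], key=pinyin_key)
```

Under these assumptions, `vowels` comes out as ā, á, ǎ, à, a, and `u_forms` as ū, u, ǖ, ǜ, ü. A real collator would apply the rules through ICU rather than through a hand-written key.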

Notes:

3. The zh and zh_Hant collators use legacy code point order for Han characters (GB 2312 and Big5, respectively); no change to that is planned. The zh_Hant@collation=stroke collator will also be left unchanged.

4. A new zh@collation=stroke will be introduced, based on GF 3003-1999, GB13000.1 Character Set Hanzi Stroke Order (http://www.moe.edu.cn/moe-dept/yuxin/content/gfbz/scanning/zfjhzzx/gfbz30.htm).

5. zh@collation=pinyin will be regenerated as follows:

  1. Produce a mapping from Han characters to pinyin. The mapping is based on the first (highest-frequency) value in the Unihan kHanyuPinlu field or, if the character has no value there, on the first (highest-frequency) value in the Unihan kMandarin field. There may be overrides where the IBM Shanghai office believes there are errors. If so, a separate file of overrides will be produced and made available for review.
  2. The Han characters will be sorted on the basis of the pinyin mapping (according to the sorting rules in #2 above); characters that have the same pinyin mapping will be sorted on the basis of GF 3003 order (see #4 above).
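
The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual generation tool: the function names are hypothetical, the Unihan fields are assumed to be already parsed into lists of readings in descending frequency, `pinyin_key` is a simplified stand-in for the rules in #2, and all sample data (including the stroke-order ranks) is toy data that is not copied from Unihan or GF 3003. Placing characters with no reading at the end is also an assumption, since the proposal does not say where they go.

```python
import unicodedata

TONES = "\u0304\u0301\u030C\u0300"  # macron, acute, caron, grave


def pinyin_key(pinyin):
    """Simplified stand-in for the pinyin rules: letters first, then tone."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    letters = "".join(c for c in decomposed if c not in TONES)
    tones = [TONES.index(c) for c in decomposed if c in TONES]
    return (letters.lower(), tones or [len(TONES)])  # unmarked sorts last


def choose_pinyin(char, hanyu_pinlu, mandarin, overrides):
    """Step 1: override, else first kHanyuPinlu value, else first kMandarin."""
    if char in overrides:
        return overrides[char]
    for field in (hanyu_pinlu, mandarin):
        readings = field.get(char)
        if readings:
            return readings[0]
    return None


def sort_han(chars, hanyu_pinlu, mandarin, overrides, stroke_rank):
    """Step 2: sort by pinyin, breaking ties by stroke-order rank."""
    def key(char):
        pinyin = choose_pinyin(char, hanyu_pinlu, mandarin, overrides)
        if pinyin is None:
            # Assumption: characters without a reading go last.
            return (1, ("", []), stroke_rank[char])
        return (0, pinyin_key(pinyin), stroke_rank[char])
    return sorted(chars, key=key)


# Toy data for illustration only.
hanyu_pinlu = {"把": ["bǎ", "bà"], "吧": ["ba"]}
mandarin = {"八": ["bā"], "巴": ["bā"], "把": ["bà"], "爸": ["bà"], "吧": ["bā"]}
stroke_rank = {"八": 0, "巴": 1, "把": 2, "吧": 3, "爸": 4}

result = sort_han("吧爸把巴八", hanyu_pinlu, mandarin, {}, stroke_rank)
```

With this toy data, 把 takes its reading from the kHanyuPinlu list rather than the kMandarin list, and 八 and 巴, which share the reading bā, are separated only by their stroke-order rank.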

6. Note: the same pinyin file will also be used to update the ICU pinyin transliterator.