The following are recommended changes to the Chinese collators.
Here is the list of collators that are currently in CLDR 1.0. The names are a bit different (see #1 below), but there are four distinct orderings for Chinese.
The zh_Hant@collation=stroke and zh@collation=pinyin collators are based on data supplied some time ago by Lotus.
1. Add aliases so that:
2. Fix the Latin Pinyin ordering in all of the Chinese collators (including the new ones below) according to GB/T 13418-92, Section 5.1.1 Chinese Pinyin Collation Rules: (http://www.moe.edu.cn/moe-dept/yuxin/content/gfbz/scanning/gfhbz/gfbz18.htm)
| CLDR 1.0 | Proposed CLDR 1.1 |
| --- | --- |
| & ̄ << ́ << ̌ << ̀ << ̈ | &[before 2] a << ā <<< Ā << á <<< Á << ǎ <<< Ǎ << à <<< À &[before 2] e << ē <<< Ē << é <<< É << ě <<< Ě << è <<< È &[before 2] i << ī <<< Ī << í <<< Í << ǐ <<< Ǐ << ì <<< Ì &[before 2] o << ō <<< Ō << ó <<< Ó << ǒ <<< Ǒ << ò <<< Ò &[before 2] u << ū <<< Ū << ú <<< Ú << ǔ <<< Ǔ << ù <<< Ù & U << ǖ <<< Ǖ << ǘ <<< Ǘ << ǚ <<< Ǚ << ǜ <<< Ǜ << ü |
Notes:
3. The zh and zh_Hant collators use legacy code point order for Han characters (GB 2312 and Big5 respectively); no change to that is planned. The zh_Hant@collation=stroke collator will also be left alone.
4. A new zh@collation=stroke will be introduced based on GF 3003 - 1999 GB13000.1 Character set Hanzi Stroke Order. (http://www.moe.edu.cn/moe-dept/yuxin/content/gfbz/scanning/zfjhzzx/gfbz30.htm)
5. zh@collation=pinyin will be regenerated in the following way.
6. Note: we will be using the same pinyin file to update the ICU pinyin transliterator.
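
As an illustration of the Latin Pinyin ordering proposed in item 2, here is a minimal Python sketch, not the CLDR or ICU implementation. It assumes the usual reading of the rule syntax: `<<` is a secondary (accent-level) difference, `<<<` is a tertiary (case-level) difference, and `&[before 2] a << ā ...` places the toned forms before the unmarked vowel. The names `pinyin_sort_key`, `TONE_RANK`, and `UNMARKED` are hypothetical, and the sketch handles only Latin letters carrying the four tone marks and the diaeresis.

```python
import unicodedata

# Combining marks used by Latin Pinyin.
MACRON, ACUTE, CARON, GRAVE = "\u0304", "\u0301", "\u030c", "\u0300"
DIAERESIS = "\u0308"

# Assumed secondary weights: tones 1-4 sort before the unmarked vowel.
TONE_RANK = {MACRON: 0, ACUTE: 1, CARON: 2, GRAVE: 3}
UNMARKED = 4


def pinyin_sort_key(text):
    """Return a (primary, secondary, tertiary) key for a Latin Pinyin string."""
    letters = []  # one [base, tone_rank, has_diaeresis, is_upper] entry per letter
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and letters:
            if ch in TONE_RANK:
                letters[-1][1] = TONE_RANK[ch]
            elif ch == DIAERESIS:
                letters[-1][2] = True
            # Any other combining mark is ignored in this sketch.
        else:
            letters.append([ch.lower(), UNMARKED, False, ch.isupper()])

    # Primary level: base letters only, ignoring tone marks and case.
    primary = tuple(entry[0] for entry in letters)
    # Secondary level: tone rank, with every diaeresis form placed after
    # the plain forms (ū ú ǔ ù u, then ǖ ǘ ǚ ǜ ü).
    secondary = tuple(entry[1] + (5 if entry[2] else 0) for entry in letters)
    # Tertiary level: lowercase before uppercase.
    tertiary = tuple(1 if entry[3] else 0 for entry in letters)
    return (primary, secondary, tertiary)


if __name__ == "__main__":
    words = ["ma", "mà", "la", "mǎ", "mā", "má", "Mā"]
    print(sorted(words, key=pinyin_sort_key))
    # ['la', 'mā', 'Mā', 'má', 'mǎ', 'mà', 'ma']
```

Comparing the three tuples in sequence mirrors the multi-level comparison: a tone difference matters only when the base letters are identical, and case matters only when the tones also match.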