We have received the following feedback on a number of Eastern European languages. Please look over the suggested changes, and if there is any reason not to make them, let us know as soon as possible.
Remember that the ICU rules (which correspond to CLDR syntax) only list the differences from the default ordering charted at http://www.unicode.org/charts/collation/. For more on data formats, see http://www.unicode.org/cldr/data_formats.html#Collation.
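To make the rule notation used in the tables below easier to read, here is a minimal sketch that splits a rule string into its individual relations. It is only an illustration, not ICU's parser: the names `Relation` and `parse_rules` are hypothetical, and it assumes every token is separated by whitespace, as in the rules quoted here. Note that each relation records the most recent `&` reset, while in an actual tailoring each item is ordered after the item before it in the chain.

```python
from typing import List, NamedTuple

# Relation operators that appear in the rules quoted in this document.
STRENGTHS = {"<": "primary", "<<": "secondary", "<<<": "tertiary"}


class Relation(NamedTuple):
    reset: str      # string after the most recent '&'
    strength: str   # 'primary', 'secondary' or 'tertiary'
    text: str       # string being ordered
    expansion: str  # string after '/', or '' if there is none


def parse_rules(rules: str) -> List[Relation]:
    """Split a whitespace-separated rule string into relations (toy parser)."""
    tokens = rules.split()
    relations: List[Relation] = []
    reset = ""
    i = 0
    while i < len(tokens):
        op = tokens[i]
        if i + 1 >= len(tokens):
            raise ValueError(f"operator {op!r} has no operand")
        operand = tokens[i + 1]
        i += 2
        if op == "&":
            reset = operand
        elif op in STRENGTHS:
            expansion = ""
            # An optional '/ x' after the operand is an expansion.
            if i + 1 < len(tokens) and tokens[i] == "/":
                expansion = tokens[i + 1]
                i += 2
            relations.append(Relation(reset, STRENGTHS[op], operand, expansion))
        else:
            raise ValueError(f"unexpected token {op!r}")
    return relations


for rel in parse_rules("& C < č <<< Č < ć <<< Ć & cs <<< ccs / cs"):
    print(rel)
```

Reading the output, `<` introduces a new letter (primary difference), `<<<` a case variant (tertiary difference), and `/` an expansion.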
| CLDR 1.0 | Suggested Change | Comments |
|---|---|---|
| & C < č <<< Č < ć <<< Ć & Đ < dž <<< Dž <<< DŽ & L < lj <<< Lj <<< LJ & N < nj <<< Nj <<< NJ & S < š <<< Š & Z < ž <<< Ž | & C < č <<< Č < ć <<< Ć & D < dž <<< Dž <<< DŽ & L < lj <<< Lj <<< LJ & N < nj <<< Nj <<< NJ & S < š <<< Š & Z < ž <<< Ž | 1. Changing to D will put dž ahead of Đ instead of behind it. |
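The effect of moving the reset from Đ to D can be illustrated with a toy, primary-level-only collator. This is a sketch under stated assumptions, not ICU: the base order is assumed to be a to z with đ directly after d, case is folded away, `<<` and `<<<` relations are ignored, and the helper names (`fold`, `build_order`, `sort_key`) and sample words are hypothetical.

```python
import unicodedata
from typing import List, Tuple

# Assumed base order: a-z with đ directly after d (a primary difference).
BASE = list("abcdđefghijklmnopqrstuvwxyz")

CLDR_1_0 = ("& C < č <<< Č < ć <<< Ć & Đ < dž <<< Dž <<< DŽ "
            "& L < lj <<< Lj <<< LJ & N < nj <<< Nj <<< NJ "
            "& S < š <<< Š & Z < ž <<< Ž")
SUGGESTED = CLDR_1_0.replace("& Đ <", "& D <")


def fold(text: str) -> str:
    """Normalize to NFC and lowercase (this sketch ignores case)."""
    return unicodedata.normalize("NFC", text).lower()


def build_order(rules: str) -> List[str]:
    """Insert each '<' item right after the previous item in its chain."""
    order = list(BASE)
    tokens = rules.split()
    last = ""
    for op, operand in zip(tokens[0::2], tokens[1::2]):
        item = fold(operand)
        if op == "&":
            last = item
        elif op == "<":
            if item in order:
                order.remove(item)
            order.insert(order.index(last) + 1, item)
            last = item
        # '<<' and '<<<' are skipped: primary differences only.
    return order


def sort_key(word: str, order: List[str]) -> Tuple[int, ...]:
    """Map a word to primary ranks, matching the longest element first."""
    rank = {element: i for i, element in enumerate(order)}
    longest = max(len(element) for element in order)
    w = fold(word)
    key = []
    i = 0
    while i < len(w):
        for n in range(longest, 0, -1):
            chunk = w[i:i + n]
            if chunk in rank:
                key.append(rank[chunk])
                i += len(chunk)
                break
        else:
            # Unknown characters sort after everything else, by code point.
            key.append(len(order) + ord(w[i]))
            i += 1
    return tuple(key)


words = ["đak", "džem", "dan"]
old_order = build_order(CLDR_1_0)
new_order = build_order(SUGGESTED)
print(sorted(words, key=lambda w: sort_key(w, old_order)))  # ['dan', 'đak', 'džem']
print(sorted(words, key=lambda w: sort_key(w, new_order)))  # ['dan', 'džem', 'đak']
```

With the CLDR 1.0 reset, dž follows đ; with the suggested reset, dž sits between d and đ.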
| CLDR 1.0 | Suggested Change | Comments |
|---|---|---|
| & A < ă <<< Ă & D < đ <<< Đ & I < î <<< Î & S < ş <<< Ş & Þ < ţ <<< Ţ & Z < ż <<< Ż | & A < â <<< Â < ă <<< Ă & D < đ <<< Đ & I < î <<< Î & S < ş <<< Ş & T < ţ <<< Ţ & Z < ż <<< Ż | 1. Changing the order of ă and â. Note: this change will cause â to be treated as a separate letter from a on a primary level, which changes the resulting ordering. Can we verify that this is what is desired? If so, then we probably also need to add â to the exemplar characters. 2. Changing ţ to sort after T, not after Z. |
| CLDR 1.0 | Suggested Change | Comments |
|---|---|---|
| & A < ą <<< Ą & C < ć <<< Ć & E < ę <<< Ę & L < ł <<< Ł & N < ń <<< Ń & O < ó <<< Ó & S < ś <<< Ś & Z < ź <<< Ź < ż <<< Ż | & A < ą <<< Ą & C < ć <<< Ć & E < ę <<< Ę & L < ł <<< Ł & N < ń <<< Ń & O < ó <<< Ó & S < ś <<< Ś & Z < ż <<< Ż < ź <<< Ź | 1. Change the order of ź and ż. |
| CLDR 1.0 | Suggested Change | Comments |
|---|---|---|
| & C < č <<< Č < ć <<< Ć & Đ < dž <<< Dž <<< DŽ & L < lj <<< Lj <<< LJ & N < nj <<< Nj <<< NJ & S < š <<< Š & Z < ž <<< Ž | & C < č <<< Č < ć <<< Ć & D < dž <<< Dž <<< DŽ & L < lj <<< Lj <<< LJ & N < nj <<< Nj <<< NJ & S < š <<< Š & Z < ž <<< Ž | 1. Changing to D will put dž ahead of Đ instead of behind it. (Same as Hr) |
| CLDR 1.0 | Suggested Change | Comments |
|---|---|---|
| & C < č <<< Č & S < š <<< Š & Z < ž <<< Ž | & C < č <<< Č < ć <<< Ć & S < š <<< Š & Z < ž <<< Ž | 1. Add ć. 2. In the UCA, đ should already sort after d (primary difference), so we should take this as a request to add đ as an exemplar character. |
The above changes were made on the basis of an Excel chart that was supplied to us. We have one concern: the chart does not accurately list all of the collation rules in ICU. For example, look at Hungarian. The list on the left below is from the Excel spreadsheet. The ICU rules show other combinations that are not listed in the chart, such as DZS and CCS; according to the information we have, these sequences behave as contractions in sorting.
(Note also that the ICU rules explicitly list the strength of the differences.)
| Excel | ICU |
|---|---|
| | & C < cs <<< Cs <<< CS & D < dz <<< Dz <<< DZ & DZ < dzs <<< Dzs <<< DZS & G < gy <<< Gy <<< GY & L < ly <<< Ly <<< LY & N < ny <<< Ny <<< NY & S < sz <<< Sz <<< SZ & T < ty <<< Ty <<< TY & Z < zs <<< Zs <<< ZS & O < ö <<< Ö << ő <<< Ő & U < ü <<< Ü << ű <<< Ű & cs <<< ccs / cs & Cs <<< Ccs / cs & CS <<< CCS / CS & dz <<< ddz / dz & Dz <<< Ddz / dz & DZ <<< DDZ / DZ & dzs <<< ddzs / dzs & Dzs <<< Ddzs / dzs & DZS <<< DDZS / DZS & gy <<< ggy / gy & Gy <<< Ggy / gy & GY <<< GGY / GY & ly <<< lly / ly & Ly <<< Lly / ly & LY <<< LLY / LY & ny <<< nny / ny & Ny <<< Nny / ny & NY <<< NNY / NY & sz <<< ssz / sz & Sz <<< Ssz / sz & SZ <<< SSZ / SZ & ty <<< tty / ty & Ty <<< Tty / ty & TY <<< TTY / TY & zs <<< zzs / zs & Zs <<< Zzs / zs & ZS <<< ZZS / ZS |
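The expansion rules in the ICU column, such as `& cs <<< ccs / cs`, say that a doubled digraph like ccs sorts as cs followed by cs, differing only at the tertiary level. The sketch below illustrates that reading at the primary level by rewriting doubled digraphs before comparison. It is an assumption-laden toy rather than ICU behavior: the function name `expand_doubled` is hypothetical, case is folded away, and the rewrite is applied mechanically wherever the letter sequence occurs.

```python
# Hungarian digraphs and the trigraph from the ICU rules; 'dzs' comes
# first so that 'ddzs' is matched before 'ddz'.
DIGRAPHS = ["dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs"]


def expand_doubled(word: str) -> str:
    """Rewrite doubled digraphs (e.g. 'ssz') as two full copies ('szsz')."""
    w = word.lower()
    out = []
    i = 0
    while i < len(w):
        for digraph in DIGRAPHS:
            doubled = digraph[0] + digraph  # e.g. 'c' + 'cs' -> 'ccs'
            if w.startswith(doubled, i):
                out.append(digraph + digraph)
                i += len(doubled)
                break
        else:
            # No doubled digraph starts here; copy the character as is.
            out.append(w[i])
            i += 1
    return "".join(out)


for sample in ["ccs", "ddz", "ddzs", "hosszú"]:
    print(sample, "->", expand_doubled(sample))
```

A chart that lists only the base digraphs (cs, dz, dzs, and so on) would omit this expansion behavior, which is the gap described above.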