RE: Locale ID's again: simplified vs. traditional

From: Carl W. Brown (
Date: Wed Oct 04 2000 - 13:13:16 EDT


Ah, the poor Irish. It looks like they got the short end of the stick. At
first I wondered myself why script would have anything to do with language,
but then I thought of Latin vs. Gaelic Latin. A more recent discussion
covered Cyrillic vs. Old Slavonic Cyrillic. The language
and country are the same but the script is different. I am sure that you
know far more than I do, but if I am using the Gaelic Latin script, are there
differences such as collating sequences?

In another example, Azeri (Cyrillic) and Azeri (Latin), you have no problem.
In this case you would apply such things as the Turkish dotted and dotless i
rules for case conversion. If the characters are Latin it works, and in
Cyrillic you don't care. But in non-overlapping sets you can have problems.
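
To make the dotted and dotless i point concrete, here is a minimal sketch in
Python. The function names and the set of language codes are my own
assumptions for illustration, not any library's API. It applies the
Turkish-style rule for "az" and "tr" without looking at the script at all,
and Cyrillic text simply passes through the default conversion.

```python
# Hypothetical helpers for illustration; not part of any standard library API.
DOTTED_I_LANGS = {"az", "tr"}  # assumption: languages using the dotted/dotless i rule


def to_upper(text: str, lang: str) -> str:
    """Uppercase text, mapping i -> U+0130 for Turkish-style languages."""
    if lang in DOTTED_I_LANGS:
        # Map dotted i first; the default conversion then takes dotless
        # U+0131 to plain I on its own.
        text = text.replace("i", "\u0130")
    return text.upper()


def to_lower(text: str, lang: str) -> str:
    """Lowercase text, mapping I -> U+0131 and U+0130 -> i for Turkish-style languages."""
    if lang in DOTTED_I_LANGS:
        # Handle both capital forms before the default conversion runs.
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()


if __name__ == "__main__":
    print(to_upper("izmir", "tr"))   # Latin text: the rule applies
    print(to_upper("izmir", "en"))   # default conversion
    print(to_upper("баки", "az"))    # Cyrillic text: the rule never fires
```

Since the rule only ever touches four Latin code points, running it over
Cyrillic text is harmless, which is why script does not have to be part of
the decision in this particular case.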

If I remember correctly, aren't there differences between GB and Big5 sort
orders? If so, any collation routine would have to know the script, making
script a locale issue, not just a font issue. I also seem to recall that Big5
has more characters than GB and that it is not exactly a 1:1 reversible conversion.
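
As a rough illustration of why the conversion is not reversible, here is a
small Python sketch. The two-entry table is a toy I made up for the example,
not a real conversion table, and the function names are hypothetical. It uses
one well-known case where two traditional characters share a single
simplified form.

```python
# Toy mapping for illustration only; a real table has thousands of entries.
TRAD_TO_SIMP = {
    "\u767c": "\u53d1",  # traditional "to emit"  -> simplified form
    "\u9aee": "\u53d1",  # traditional "hair"     -> the same simplified form
}


def to_simplified(text: str) -> str:
    """Convert character by character; unknown characters pass through."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates(simp_char: str) -> list[str]:
    """List every traditional character that maps to the given simplified one."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == simp_char)


if __name__ == "__main__":
    simp = to_simplified("\u767c")
    # Two candidates come back, so the reverse direction needs context.
    print(simp, traditional_candidates(simp))
```

Going from traditional to simplified loses the distinction, so going back
requires knowing the word or the context, not just the character.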


Ar 13:55 -0800 2000-10-03, scríobh

>- The use of ISO 15924 for "sub-language specifications" has been removed
>from the draft for the successor to RFC-1766 because there was no consensus
>that the meaning and usage of these was clear.

I don't think this means they are forbidden in tags, though. It just means
we weren't sure they were universally applicable enough to be specified in
the RFC.

