RE: The mother of all collation schemes

From: Marco.Cimarosti@icl.com
Date: Fri Jun 16 2000 - 08:45:50 EDT


rampshot@usa.net wrote:
> [...] ÿ(why couldn't I find this in uppercase?) [...]

Because, in Dutch usage, the corresponding uppercase is not one character but
two: "IJ". There, "ÿ" is optionally used as a ligature for the sequence "ij"
(strictly speaking, U+00FF is y with diaeresis, and its Unicode uppercase "Ÿ",
U+0178, lies outside Latin-1). E.g. "ijs" (= ice, ice-cream) is also spelled
"ÿs", and both are capitalized as "IJs" (not a typo: "Ijs" would be a
spelling error).
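
To make the "IJs" rule concrete, here is a minimal Python sketch. The helper
name capitalize_dutch is hypothetical (it is not a library function), and it
only looks at a word-initial "ij"; real Dutch text needs more care.

```python
def capitalize_dutch(word: str) -> str:
    """Capitalize a word, treating an initial "ij" as one two-letter unit.

    Hypothetical helper for illustration only.
    """
    if word[:2].lower() == "ij":
        # Both letters of the digraph are uppercased together.
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


print(capitalize_dutch("ijs"))  # IJs
print("ijs".capitalize())       # Ijs  (the spelling error)
```

The generic capitalize() only knows about single characters, which is
exactly why it produces the wrong form.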

A similar case is the German ligature "ß", which can also be spelled "ss"
(the "ss" spelling is just an alternative in Germany, but it is mandatory in
Switzerland) and is uppercased as "SS".
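
The one-to-two mapping is easy to see with Python's built-in string methods,
which follow the Unicode default (language-independent) case mappings. This is
only an illustration of the point above, not a sorting recipe.

```python
word = "straße"
upper = word.upper()

print(upper)                  # STRASSE
print(len(word), len(upper))  # 6 7  (uppercasing changed the length)

# The round trip does not give back the original spelling.
print(upper.lower())          # strasse

# Case folding maps "ß" to "ss", so both spellings compare equal.
print(word.casefold() == "strasse".casefold())  # True
```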

These are just a few ("easy") examples of cases not handled by
case-folding-based sorting algorithms. There are much worse cases, like
letters having different uppercase forms in different languages. E.g., in
Turkish, "I" is not the uppercase of "i": it is the uppercase of a different
letter (dotless "ı"), which sorts before "i" in the Turkish alphabet.
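
A rough Python sketch of the Turkish case follows. The names turkish_lower,
turkish_upper and turkish_key are hypothetical helpers written for this
example; the sort key handles only the 29 letters of the Turkish alphabet,
ignores case as a tie-breaker, and is no substitute for a real collation
library.

```python
# Lowercase Turkish alphabet; dotless "ı" comes just before "i".
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
RANK = {ch: n for n, ch in enumerate(TURKISH_ALPHABET)}


def turkish_lower(text: str) -> str:
    # Handle the two Turkish-specific capitals first, then lowercase the rest.
    return text.replace("I", "ı").replace("İ", "i").lower()


def turkish_upper(text: str) -> str:
    # Dotted "i" uppercases to dotted "İ"; dotless "ı" uppercases to "I".
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_key(text: str) -> list:
    # Characters outside the alphabet sort last (an arbitrary choice here).
    return [RANK.get(ch, len(RANK)) for ch in turkish_lower(text)]


print("i".upper())         # I  (default, language-independent mapping)
print(turkish_upper("i"))  # İ
print(turkish_upper("ı"))  # I

words = ["ilk", "ılık"]
print(sorted(words))                   # ['ilk', 'ılık']  (code point order)
print(sorted(words, key=turkish_key))  # ['ılık', 'ilk']  (Turkish order)
```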

_ Marco


