RE: Encoding for Fun (was Line Separator)

Date: Wed Oct 22 2003 - 11:11:43 CST

> I can't argue with that ... but my strings were always in (32-bit wide)
> Unicode at "sort-time". I'm not sure exactly how much value there is in a
> lexicographical sort anyway. I mean, even in Latin-1, surely '' should
> not come after 'z'?

Not always. In particular there are times when a dependable sort order is
required, but just what that sort order is isn't important. In those cases it
can be useful that UTF-8 and UTF-32 will both do a binary sort with equivalent

> Of course, UTF-16 doesn't have the binary sort property either.

Nope, though an efficient mechanism to sort UTF-16 in the codepoint order is


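For anyone who wants to see the binary-order point concretely, here is a small Python sketch. It sorts a few sample strings by their raw UTF-8, UTF-32 and UTF-16 encodings and compares each result with plain code point order. The sample strings and the helper names (utf16_units, fixup_unit) are my own, made up for illustration. The fix-up shown is one commonly described trick (rotating the surrogate range above U+E000..U+FFFF before comparing) and is not necessarily the mechanism referred to above.

```python
import struct

def utf16_units(s):
    """Return the big-endian UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

def fixup_unit(u):
    """Hypothetical helper: remap a UTF-16 code unit so that surrogates
    (D800..DFFF) compare greater than E000..FFFF, as code points would."""
    if u >= 0xE000:
        return u - 0x0800   # E000..FFFF -> D800..F7FF
    if u >= 0xD800:
        return u + 0x2000   # D800..DFFF -> F800..FFFF
    return u

# Sample data (assumed): ASCII, Latin-1, a high BMP character, and a
# supplementary character that needs a surrogate pair in UTF-16.
samples = ["z", "\u00e9", "\uff5e", "\U00010000", "a"]

by_codepoint = sorted(samples, key=lambda s: [ord(c) for c in s])
by_utf8 = sorted(samples, key=lambda s: s.encode("utf-8"))
by_utf32 = sorted(samples, key=lambda s: s.encode("utf-32-be"))
by_utf16 = sorted(samples, key=utf16_units)
by_utf16_fixed = sorted(
    samples, key=lambda s: [fixup_unit(u) for u in utf16_units(s)]
)

print(by_utf8 == by_codepoint)         # UTF-8 bytes vs. code point order
print(by_utf32 == by_codepoint)        # UTF-32 (big-endian) vs. code point order
print(by_utf16 == by_codepoint)        # raw UTF-16 code units vs. code point order
print(by_utf16_fixed == by_codepoint)  # UTF-16 with the fix-up applied
```

With these samples the raw UTF-16 comparison puts U+10000 (D800 DC00) ahead of U+FF5E, which is where it parts ways with the other two encodings. Note that the UTF-32 comparison only works byte-wise in big-endian form; little-endian bytes would need to be compared as 32-bit units instead.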