From: Markus Scherer (email@example.com)
Date: Fri Dec 03 2004 - 14:07:23 CST
I feel the need to correct one misperception:
Lars Kristan wrote:
> 4.1 - UTF-32 is probably very useful for certain string operations.
> Changing case for example. You can do it in-place, like you could with
> ASCII. Perhaps it can even be done in UTF-8, I am not sure. But even if
> it is possible today, it is definitely not guaranteed that it will
> always remain so, so one shouldn't rely on it.
Wrong even for UTF-32: case changes cannot always be done in place. Sharp s (U+00DF) uppercases to
two characters, "SS". Other examples of case mapping expansion and contraction are in
SpecialCasing.txt (one of the UCD files).
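To make the UTF-32 point concrete, here is a minimal sketch in Python. It assumes Python 3.9 or later, whose str.upper() applies the full (not just simple) Unicode case mappings; the helper names utf32_units and upper_growth are made up for this illustration.

```python
def utf32_units(s: str) -> int:
    """Number of 32-bit code units needed to store s in UTF-32 (no BOM)."""
    return len(s.encode("utf-32-le")) // 4


def upper_growth(s: str) -> tuple[int, int]:
    """Return (code units before, code units after) full uppercasing of s."""
    return utf32_units(s), utf32_units(s.upper())


# Sharp s (U+00DF) and the fi ligature (U+FB01) both expand when uppercased.
for ch in ("\u00df", "\ufb01"):
    before, after = upper_growth(ch)
    print(f"U+{ord(ch):04X}: {before} -> {after} code units")
```

Both characters go from one code unit to two, so a fixed-size UTF-32 buffer cannot hold the result in place.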
For UTF-8, there are also _simple_ (1:1) case mappings that change the length: long s (U+017F) is
two bytes, but uppercases to S, which is one byte. Meanwhile, sharp s to SS happens not to change
the UTF-8 string length, since both are two bytes.
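The UTF-8 byte counts can be checked the same way. Again a sketch only, assuming Python 3.9 or later; utf8_len and utf8_len_change are hypothetical helper names, not part of any library.

```python
def utf8_len(s: str) -> int:
    """Length of s in bytes when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def utf8_len_change(s: str) -> tuple[int, int]:
    """Return (UTF-8 bytes before, UTF-8 bytes after) uppercasing of s."""
    return utf8_len(s), utf8_len(s.upper())


# Long s (U+017F): a simple 1:1 mapping to "S" that shrinks from 2 bytes to 1.
print("U+017F:", utf8_len_change("\u017f"))

# Sharp s (U+00DF): a 1:2 mapping to "SS" that stays at 2 bytes.
print("U+00DF:", utf8_len_change("\u00df"))
```

So the number of characters and the number of UTF-8 bytes can change independently of each other, and neither is safe to assume constant.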
PS: I wrote UTN #12 :-)
-- Opinions expressed here may not reflect my company's positions unless otherwise noted.