Re: Case mapping errors?

From: Mark Davis (markdavis@ispchannel.com)
Date: Thu Jun 22 2000 - 09:54:32 EDT


These characters are coded purely for compatibility. Unicode does not distinguish letters by the abbreviations that they happen to be used in. There is no difference in semantics between the "g" in "go" vs. the "g" in "12g", nor between the "Å" in "Århus" vs. the "Å" in "15Å", nor -- for that matter -- the "U" in "Underwood" vs. the "U" in "UTF-8".

Mark
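
One way to see the "coded for compatibility" point is to normalize the five characters from the quoted message below. This is only a sketch using Python's standard unicodedata module; the helper name fold_to_letter is made up for illustration, and the results depend on the Unicode data shipped with the interpreter.

```python
import unicodedata

# The five characters from the quoted message (code points as listed there).
SIGNS = ["\u00b5", "\u1fbe", "\u2126", "\u212a", "\u212b"]


def fold_to_letter(ch: str) -> str:
    """Hypothetical helper: apply NFKC so a compatibility sign becomes its ordinary letter."""
    return unicodedata.normalize("NFKC", ch)


for sign in SIGNS:
    letter = fold_to_letter(sign)
    # Print the name of the sign next to the name of the letter it normalizes to.
    print(f"U+{ord(sign):04X} {unicodedata.name(sign)} -> "
          f"U+{ord(letter):04X} {unicodedata.name(letter)}")
```

After normalization, KELVIN SIGN is the same code point as the "K" in any other word, which is the sense in which the sign and the letter do not differ in semantics.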

John O'Conner wrote:

> There are 5 characters that are giving me a little discomfort because of
> their case mappings:
>
> * U+00B5 MICRO SIGN
> * U+1FBE GREEK PROSGEGRAMMENI
> * U+2126 OHM SIGN
> * U+212A KELVIN SIGN
> * U+212B ANGSTROM SIGN
>
> Each of these has case mappings...and I really don't understand why. It
> appears that none of these has any "round-trip" capability to map back
> from another case. I suppose this can be argued for a lot of mappings.
>
> The most difficult cases are 2126, 212A, and 212B. These characters are
> "letter-like" in their glyph appearance, but it seems that their actual
> semantics are not. It seems like someone may have looked at KELVIN SIGN
> for example, decided it looked like a Latin-1 'K' and gave it the same
> lowercase mapping. Still, would you really expect to lowercase a KELVIN
> SIGN to a small 'k'? I can't imagine...but I may not be as imaginative
> as some. I have the same argument for OHM SIGN and ANGSTROM SIGN.
> Although they have case mappings, are they expected by most people? If I
> were using the OHM, ANGSTROM, or KELVIN SIGN in my work, I would be very
> surprised if a case operation changed them...maybe I would be
> disappointed or frustrated even. Are these bugs in the spec? Or do I
> just need to think about them a little differently?
>
> Best regards,
> John O'Conner
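
The missing "round-trip" described above can be checked directly. The following is a minimal sketch using Python's built-in str.upper() and str.lower(); the helper names other_case and round_trips are hypothetical, and the behaviour assumes the interpreter's Unicode tables carry the case mappings discussed in this thread.

```python
# The five characters from the list above.
CHARS = {
    "\u00b5": "MICRO SIGN",
    "\u1fbe": "GREEK PROSGEGRAMMENI",
    "\u2126": "OHM SIGN",
    "\u212a": "KELVIN SIGN",
    "\u212b": "ANGSTROM SIGN",
}


def other_case(ch: str) -> str:
    """Hypothetical helper: map an uppercase character to lowercase, anything else to uppercase."""
    return ch.lower() if ch.isupper() else ch.upper()


def round_trips(ch: str) -> bool:
    """True if mapping to the other case and back returns the original character."""
    return other_case(other_case(ch)) == ch


for ch, name in CHARS.items():
    mapped = other_case(ch)
    back = other_case(mapped)
    # Show original -> other case -> back again, as code points.
    print(f"{name}: U+{ord(ch):04X} -> U+{ord(mapped):04X} -> U+{ord(back):04X} "
          f"(round trip: {round_trips(ch)})")
```

For comparison, an ordinary letter such as "K" does survive the same round trip, so the loss is specific to the compatibility characters.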


