Re: UnicodeData.txt is invalid, flawed, broken, corrupt and wrong

From: Aki Inoue (aki@apple.com)
Date: Sat Jun 11 2005 - 15:09:22 CDT

Next message: Theodore H. Smith: "Re: UnicodeData.txt is invalid, flawed, broken, corrupt and wrong"

Previous message: Theodore H. Smith: "UnicodeData.txt is invalid, flawed, broken, corrupt and wrong"
In reply to: Theodore H. Smith: "UnicodeData.txt is invalid, flawed, broken, corrupt and wrong"
Next in thread: Theodore H. Smith: "Re: UnicodeData.txt is invalid, flawed, broken, corrupt and wrong"
Reply: Theodore H. Smith: "Re: UnicodeData.txt is invalid, flawed, broken, corrupt and wrong"

Theodore,

According to the Unicode code charts, KELVIN SIGN (U+212A) has a canonical
decomposition mapping to LATIN CAPITAL LETTER K (U+004B), so the Unicode
database is correct.
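
As a rough illustration, the same data can be inspected with Python's
standard unicodedata module, which exposes the decomposition field of
UnicodeData.txt. In that field a compatibility mapping carries a tag in
angle brackets, and a canonical mapping has no tag. The helper name
decomposition_kind is made up for this sketch.

```python
import unicodedata

def decomposition_kind(ch):
    """Classify a character's decomposition field (hypothetical helper)."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings start with a tag such as <circle> or <compat>.
    if mapping.startswith("<"):
        return "compatibility"
    return "canonical"

# KELVIN SIGN: the field holds a bare code point, so it is canonical.
print(unicodedata.decomposition("\u212A"), decomposition_kind("\u212A"))
# CIRCLED DIGIT ONE: the field holds a tag, so it is compatibility.
print(unicodedata.decomposition("\u2460"), decomposition_kind("\u2460"))
```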

Note that in Normalization Form C processing, you do not compose through
single-character canonical mappings (singletons) such as KELVIN SIGN or
ANGSTROM SIGN. They decompose, but nothing composes back to them.
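
A minimal sketch of that rule, assuming a composer that derives its
composition table from the decomposition mappings: skip tagged
(compatibility) entries and skip singletons, whose canonical mapping is a
single code point. The function name composition_pairs is invented for
this example, and a real table would also need to honor the remaining
composition exclusions, which this sketch ignores.

```python
import unicodedata

def composition_pairs(code_points):
    """Build {(first, second): composite} from canonical pair mappings only."""
    pairs = {}
    for cp in code_points:
        mapping = unicodedata.decomposition(chr(cp))
        if not mapping or mapping.startswith("<"):
            continue  # no mapping, or a compatibility mapping
        parts = [int(p, 16) for p in mapping.split()]
        if len(parts) != 2:
            continue  # singleton such as KELVIN SIGN: never a composition target
        pairs[(parts[0], parts[1])] = cp
    return pairs

# A WITH RING ABOVE yields a pair; KELVIN SIGN and ANGSTROM SIGN yield none.
pairs = composition_pairs([0x00C5, 0x212A, 0x212B])
print(pairs)

# The library's own NFC leaves a plain K alone and folds KELVIN SIGN to K.
print(unicodedata.normalize("NFC", "Kitchen") == "Kitchen")
print(unicodedata.normalize("NFC", "\u212Aitchen") == "Kitchen")
```

With a table built this way, "K" has no entry to compose into, so
"Kitchen" stays "Kitchen".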

Aki

>
> No one from the official Unicode.org company replied to me last
> time, so I'll try again.
>
> Why is it that the entry for Kelvin (a unit of temperature) has a
> decomposition, listed as a canonical decomposition, to the standard
> ASCII "K"?
>
> This decomposition is actually a compatibility decomposition.
>
> How does this cause me problems? I've written a parser for
> UnicodeData.txt. This parser will extract data for decomposition,
> and for composition also.
>
> Because Kelvin canonically decomposes to K, it follows that K
> canonically composes to Kelvin! :o(
>
> So my composer will change a word like this: "Kitchen", into
> "(Kelvin)itchen". Which is just totally wrong. All because
> UnicodeData.txt is broken.
>
> That is what I think. But I might be wrong.
>
> Can someone from Unicode.org please confirm or deny all of this?
> That will put my mind at rest, because I need the official answer.
>
> --
> http://elfdata.com/plugin/ Industrial strength string processing,
> made easy.
>
> "All things are logical. Putting free-will in the slot for premises in
> a logical system, makes all of life both understandable, and free."
>
>
>


This archive was generated by hypermail 2.1.5 : Sat Jun 11 2005 - 15:10:53 CDT