From: Asmus Freytag (firstname.lastname@example.org)
Date: Wed Feb 26 2003 - 21:18:05 EST
Can we retitle this thread?
I'm getting actual replies to my posting of the BETA that I need to keep
track of, and the run-on discussion of UTF-8 under this title is distracting.
Thanks for your help,
At 04:56 PM 2/26/03 -0800, you wrote:
>Yung-Fong Tang wrote:
>>I see a hole here. What about UTF-8 representing a pair of surrogate
>>code points with two 3-octet sequences instead of one 4-octet UTF-8
>>sequence? That should be ill-formed, since it is also non-shortest form,
>>right? But we really need to watch the language used there so we
>>don't create a new problem. I DO NOT want people to think that one 3-octet
>>UTF-8 high or low surrogate is ill-formed, but that a 3-octet UTF-8
>>high surrogate followed by a 3-octet UTF-8 low surrogate is legal.
>How would you infer that a pair of any ill-formed sequences is not also
>ill-formed, without any specific text allowing such?
>Remember also that such pairs of 3-byte surrogate sequences were forbidden
>at the same time CESU-8 was created.
>Opinions expressed here may not reflect my company's positions unless
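
For readers following the UTF-8 point quoted above, here is a minimal sketch of the byte sequences in question. It assumes Python 3, whose strict "utf-8" codec rejects surrogate code points; the helper name is_well_formed_utf8 is hypothetical and does not come from the thread or the standard. It uses U+10000 as the example supplementary character.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if Python's strict UTF-8 decoder accepts data."""
    try:
        data.decode("utf-8")  # error handling is 'strict' by default
        return True
    except UnicodeDecodeError:
        return False


# U+10000 in well-formed UTF-8 is a single 4-octet sequence.
four_octet = "\U00010000".encode("utf-8")

# The same character as UTF-16 surrogates is D800 DC00. Encoding each
# surrogate on its own with the 3-octet pattern gives these bytes.
high_surrogate = b"\xed\xa0\x80"  # 3-octet form of U+D800
low_surrogate = b"\xed\xb0\x80"   # 3-octet form of U+DC00
paired = high_surrogate + low_surrogate  # CESU-8 style, 6 octets

for label, data in [
    ("4-octet form", four_octet),
    ("lone high surrogate", high_surrogate),
    ("lone low surrogate", low_surrogate),
    ("high + low pair", paired),
]:
    print(label, data.hex(" "), is_well_formed_utf8(data))
```

Under this decoder the pair is rejected for the same reason each half is rejected: nothing makes two ill-formed sequences well-formed when they are placed side by side.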