Re: Unicode 4.0 BETA available for review

From: Kenneth Whistler ([email protected])
Date: Thu Feb 27 2003 - 15:42:43 EST

Next message: Yung-Fong Tang: "Re: UTF-8 Error Handling (was: Re: Unicode 4.0 BETA available for review)"

Previous message: Yung-Fong Tang: "Re: Unicode 4.0 BETA available for review"
Maybe in reply to: Asmus Freytag: "Unicode 4.0 BETA available for review"
Next in thread: Yung-Fong Tang: "Re: Unicode 4.0 BETA available for review"
Reply: Yung-Fong Tang: "Re: Unicode 4.0 BETA available for review"

Frank Tang asked:

> >> This discussion has been centered around UTF-8. But I hope the
> >>corresponding rules apply to UTF-16 and UTF-32 for Unicode 4.0:
> >>
> >>. for UTF-32: occurrences of 'surrogates' are ill-formed.
> >>
> How about a UTF-32 sequence whose 4 bytes represent a value > U+10FFFF?
> Is it considered ill-formed? Should it be?

Yes, they are ill-formed.

Since all the encoding forms are based on the Unicode scalar values,
and since the Unicode scalar values are *defined* to be the
ranges 0x0000..0xD7FF and 0xE000..0x10FFFF, any attempt to represent
a code point higher than U+10FFFF in *any* encoding form is
ill-formed.
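
As an illustration only (not part of the Unicode 4.0 text), the range
definition above can be sketched as a small predicate. The function name
is_unicode_scalar_value is hypothetical, chosen for this example:

```python
def is_unicode_scalar_value(cp: int) -> bool:
    """Return True if cp lies in 0x0000..0xD7FF or 0xE000..0x10FFFF.

    The name is hypothetical; the two ranges are the ones quoted in
    the message above.
    """
    return 0x0000 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF


# Probe the boundaries: the surrogate block and anything above U+10FFFF
# fall outside the scalar value ranges.
for cp in (0x0041, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000):
    print(f"{cp:#010x} -> {is_unicode_scalar_value(cp)}")
```

Because every encoding form is defined over these values, the same check
applies whether the input arrived as UTF-8, UTF-16, or UTF-32.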

This will be called out explicitly in the Unicode 4.0 text, in
case anyone still has the question:

" * Any UTF-32 code unit greater than 0010FFFF₁₆ is
ill-formed."
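
For readers who want to see what that rule means at the byte level, here
is a minimal sketch that scans a UTF-32 buffer and reports the ill-formed
code units. It assumes the caller already knows the byte order and that
no BOM is present; the function name ill_formed_utf32_units and its
parameters are made up for this example.

```python
import struct


def ill_formed_utf32_units(data: bytes, byteorder: str = "little"):
    """Return (byte_offset, code_unit) pairs for ill-formed UTF-32 code units.

    A code unit is reported if it is greater than 0x10FFFF or falls in
    the surrogate range 0xD800..0xDFFF. Hypothetical helper: byte order
    is supplied by the caller and no BOM handling is attempted.
    """
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes")
    count = len(data) // 4
    prefix = "<" if byteorder == "little" else ">"
    units = struct.unpack(f"{prefix}{count}I", data)
    return [
        (index * 4, unit)
        for index, unit in enumerate(units)
        if unit > 0x10FFFF or 0xD800 <= unit <= 0xDFFF
    ]


# Four little-endian code units: 'A', a surrogate, U+10FFFF, and 0x110000.
sample = struct.pack("<4I", 0x41, 0xD800, 0x10FFFF, 0x110000)
for offset, unit in ill_formed_utf32_units(sample):
    print(f"offset {offset}: {unit:#010x} is ill-formed")
```

What a decoder should do after detecting such a unit (reject the whole
input, substitute U+FFFD, and so on) is a separate policy question that
this sketch does not address.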

I can keep answering these questions, but I can also assure
everyone that the UTC worked *very* hard this time around to
make the character encoding model much clearer in the Unicode 4.0
text, and to anticipate all these edge cases.

--Ken


This archive was generated by hypermail 2.1.5 : Thu Feb 27 2003 - 16:27:20 EST