On Wed, Mar 08, 2000 at 03:59:40PM -0000, Marco.Cimarosti@icl.com wrote:
> Keld Jørn Simonsen wrote, responding to me:
> >>I understood that Unicode had extended beyond the
> >>0x0..0xFFFF range. The fact that no code point
> >>is assigned yet in the 0x10000..0x10FFFF range
> >>does not mean that these code points don't exist.
> > Yes, but my last reading was that surrogates are characters.
> > Maybe it was changed with 3.0
> Uhm... Probably they are: the meaning of "character" is every day more
> But this brings another question: what is the role of surrogates if I am
> using 32-bit units?
> Consider this *UCS-4* fragment:
> ... U-00D800 U-00DC00 ...
> What kind of animal would that be!? An (absurd) sequence of two characters
> or an alternative spelling for U-010000?
The surrogate codes that UTF-16 uses to extend into planes 1-16
are not allowed in UCS-4 (or in UTF-8, for that matter), so the
fragment above is neither two characters nor a spelling of U-010000.
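
To illustrate the point, here is a rough Python sketch, not text from either standard. The names combine_surrogates and check_ucs4 are made up for this example. The first function shows how a UTF-16 decoder turns a surrogate pair into one code point; the second shows a UCS-4 consumer refusing surrogate values instead of interpreting them.

```python
# Surrogate ranges reserved for UTF-16.
HIGH_FIRST, HIGH_LAST = 0xD800, 0xDBFF
LOW_FIRST, LOW_LAST = 0xDC00, 0xDFFF


def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not HIGH_FIRST <= high <= HIGH_LAST:
        raise ValueError("not a high surrogate: %04X" % high)
    if not LOW_FIRST <= low <= LOW_LAST:
        raise ValueError("not a low surrogate: %04X" % low)
    # Each surrogate carries 10 bits; the result starts at plane 1.
    return 0x10000 + ((high - HIGH_FIRST) << 10) + (low - LOW_FIRST)


def check_ucs4(units):
    """Return the units unchanged, or raise if any is a surrogate value.

    Hypothetical validator: in a 32-bit form the surrogate values must
    not appear at all, so they are rejected, never paired up.
    """
    for index, unit in enumerate(units):
        if HIGH_FIRST <= unit <= LOW_LAST:
            raise ValueError(
                "surrogate %04X at index %d is not valid in UCS-4"
                % (unit, index)
            )
    return list(units)
```

With these definitions, combine_surrogates(0xD800, 0xDC00) gives 0x10000 when the input is UTF-16, while check_ucs4([0xD800, 0xDC00]) raises ValueError. In UCS-4 that character is written directly as the single unit 0x00010000.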