RE: U+xxxx, U-xxxxxx, and the basics

From: Marco.Cimarosti@icl.com
Date: Wed Mar 08 2000 - 10:59:40 EST


Keld Jørn Simonsen wrote, responding to me:
>>I understood that Unicode had extended beyond the
>>0x0..0xFFFF range. The fact that no code point
>>is assigned yet in the 0x10000..0x10FFFF range
>>does not mean that these code points don't exist.
>
> Yes, but my last reading was that surrogates are characters.
> Maybe it was changed with 3.0

Uhm... Probably they are: the meaning of "character" gets vaguer every
day.

But this brings another question: what is the role of surrogates if I am
using 32-bit units?

Consider this *UCS-4* fragment:

        ... U-00D800 U-00DC00 ...

What kind of animal would that be!? An (absurd) sequence of two characters
or an alternative spelling for U-010000?
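
For reference, the second reading corresponds to the UTF-16 pairing rule applied to 32-bit units. The Python sketch below only illustrates that arithmetic; the helper name `combine_surrogates` is hypothetical, and whether a UCS-4 decoder should apply the rule at all is exactly the open question here.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Map a UTF-16 surrogate pair to the code point it stands for.

    Hypothetical helper for illustration only: it applies the UTF-16
    pairing rule and makes no claim about how UCS-4 data must be read.
    """
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits; the pair is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The fragment from the message, taken as two 32-bit units.
fragment = [0xD800, 0xDC00]
print(f"U-{combine_surrogates(*fragment):06X}")
```

Under the pairing rule the fragment maps to U-010000, the first code point beyond the 0x0..0xFFFF range; under the other reading it stays two separate units.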

Ciao. Marco


