Adam,
It is probably best to speak of UTF-32, which has replaced UCS-4 just as
UTF-16 has replaced UCS-2.  The only Unicode encoding form that uses
surrogates is UTF-16.  UTF-32 uses scalar values, not surrogate pairs, so a
character outside the BMP is stored as a single 32-bit value.  The surrogate
code points (U+D800..U+DFFF) are not valid in UTF-32.  No well-formed UTF-16
or UTF-8 sequence converts to that range of UTF-32 values, just as values
above 0x0010FFFF are not valid Unicode code points.  Likewise, the UTF-8 byte
sequences from ED A0 80 (U+D800) to ED BF BF (U+DFFF), and any sequence above
F4 8F BF BF (U+10FFFF), are not valid.
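
To make the rule concrete, here is a minimal sketch in C.  The function
names are my own for illustration and are not taken from CVTUTF.C; this
assumes a C99 compiler with <stdint.h> and <stdbool.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from CVTUTF.C): true if c is a Unicode scalar
 * value, i.e. something that may legally appear as a UTF-32 code unit. */
static inline bool is_unicode_scalar(uint32_t c)
{
    /* Surrogate code points are reserved for UTF-16 and are never scalars. */
    if (c >= 0xD800u && c <= 0xDFFFu)
        return false;
    /* Nothing above U+10FFFF is a Unicode code point. */
    return c <= 0x10FFFFu;
}

/* Hypothetical helper: combine a UTF-16 surrogate pair into the single
 * scalar value that UTF-32 stores.  The caller is assumed to have checked
 * that hi is in D800..DBFF and lo is in DC00..DFFF. */
static inline uint32_t scalar_from_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u
         + (((uint32_t)hi - 0xD800u) << 10)
         + ((uint32_t)lo - 0xDC00u);
}
```

A converter to UTF-32 would emit the result of scalar_from_surrogates()
as one value, and a validator would reject any input value for which
is_unicode_scalar() returns false.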
Carl
> -----Original Message-----
> From: unicode-bounce@unicode.org
> [mailto:unicode-bounce@unicode.org]On Behalf Of Adam Twardoch
> Sent: Saturday, August 25, 2001 11:34 PM
> To: Marcin 'Qrczak' Kowalczyk; unicode@unicode.org
> Subject: Re: Nonsense in
> http://www.unicode.org/Public/PROGRAMS/CVTUTF/CVTUTF.C?
>
>
> ----- Original Message -----
> From: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
> > I don't understand. I'm talking about characters above U+FFFF, not
> > about characters from the range U+D800..DFFF. They are represented
> > as themselves in UCS-4. But the said routine represents them as pairs
> > of surrogates.
>
> So my question for clarification:
>
> Does UCS-4 use scalar values or surrogate pairs to represent codes from
> outside of the BMP?
>
> Adam
>
>
This archive was generated by hypermail 2.1.2 : Sun Aug 26 2001 - 10:03:20 EDT