Re: Private Use Surrogate Pairs

From: Asmus Freytag (
Date: Thu May 09 2002 - 03:51:10 EDT

At 09:59 PM 5/8/02 -0700, Doug Ewell wrote:
>Peter_Constable at sil dot org wrote:
> > I think Jim is asking for clarification in the text of the Standard
> > and not just in a response to him, but in case anyone isn't sure,
> > the four that are excluded are U+FFFFE, U+FFFFF, U+10FFFE and
> > U+10FFFF.
> >
> > And don't bother asking for a good reason *why* they are excluded:
> > there isn't any good reason why; they just are.
>I know it's popular to say there's no good reason for these to be
>excluded, but at least excluding the U+xxFFFE code points helps prevent
>UTF-32LE from being detected as big-endian UTF-16 with BOM:
> Big-endian UTF-16: FE FF .. ..
> U+xxFFFE in UTF-32LE: FE FF xx 00
>-Doug Ewell
> Fullerton, California

This may in fact be the reason they were all excluded, although no actual
reason has ever been found; it is lost in the mists of time, or of early
10646 history.
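
For anyone who wants to see the byte-level collision Doug describes, here is a minimal sketch in Python. The helper names (utf32le_bytes, looks_like_utf16be_bom) are made up for illustration and are not from any library. The check is deliberately naive: it looks only at the first two bytes, as a simple BOM sniffer might.

```python
import struct

# Byte sequence of U+FEFF when serialized as big-endian UTF-16.
UTF16_BE_BOM = b"\xfe\xff"


def utf32le_bytes(code_point: int) -> bytes:
    """Serialize one code point as a single little-endian 32-bit unit.

    Hypothetical helper: struct is used instead of str.encode() so that
    noncharacters can be serialized without any codec policy getting in
    the way.
    """
    return struct.pack("<I", code_point)


def looks_like_utf16be_bom(data: bytes) -> bool:
    """Naive sniffing: does the stream start with FE FF?"""
    return data[:2] == UTF16_BE_BOM


if __name__ == "__main__":
    # U+xxFFFE collides with the big-endian UTF-16 BOM; U+xxFFFF does not.
    for cp in (0xFFFFE, 0x10FFFE, 0xFFFFF, 0x10FFFF):
        raw = utf32le_bytes(cp)
        print(f"U+{cp:04X}: {raw.hex(' ').upper()}  "
              f"starts with FE FF: {looks_like_utf16be_bom(raw)}")
```

Note that this only supports the argument for the U+xxFFFE code points. The U+xxFFFF code points serialize as FF FF xx 00 and do not trip the check, so it does not explain why they were excluded as well.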

