Re: Unicode FAQ addendum

From: John Cowan
Date: Wed Jul 19 2000 - 15:26:27 EDT

Markus Scherer wrote:

> some of the old ones seem to be pre-unicode 1.1. should they not be updated?

No, they are 2.0.

> > 1) Unicode code units are 16 bits long; deal with it.

C1 says "A process shall interpret Unicode code values as 16-bit quantities."
"Code unit" is defined in definition D5 as a synonym for "code value".
If this needs updating, it's the Unicode folks who need to update it, not me;
I think it's still all right.
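
As a rough illustration only (not text from the standard), here is a small
Python sketch of what "16-bit quantities" means in practice. The helper name
code_units is hypothetical, and the choice of the big-endian UTF-16 codec is
an assumption made purely so the units can be unpacked easily.

```python
import struct

def code_units(text: str) -> list[int]:
    """Return the 16-bit code units of text (UTF-16, big-endian, no BOM).

    'surrogatepass' lets unpaired surrogates through the codec instead of
    raising, so every input yields a plain sequence of 16-bit values.
    """
    raw = text.encode("utf-16-be", "surrogatepass")
    # Each code unit is one unsigned 16-bit integer ("H").
    return list(struct.unpack(f">{len(raw) // 2}H", raw))

if __name__ == "__main__":
    for sample in ("A", "\u00e9", "\U00010000"):
        print([f"{u:04X}" for u in code_units(sample)])
```

Every value returned fits in 16 bits; a character outside the first 65,536
code points simply takes two such units.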

> > 4) Loose surrogates don't mean jack.
> this needs some explanation - they are illegal sequences, but should be passed through for interoperability (i think that is what the book says).

I think that behavior is "MAY" rather than "SHOULD"; the actual verb used is
"does not preclude". Anyway, this does not mean that loose surrogates
*mean* anything, only that error recovery of some sort is not forbidden.
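
To make the distinction concrete, here is a minimal sketch, again in Python
and again only an illustration: it scans a sequence of 16-bit code units and
reports the positions of loose surrogates without altering the data. The
function name loose_surrogates and the report-but-pass-through policy are my
assumptions, one possible form of error recovery, not something the standard
prescribes.

```python
def loose_surrogates(units: list[int]) -> list[int]:
    """Return indices of surrogate code units that are not part of a
    well-formed high+low pair. The input is left untouched."""
    loose = []
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:  # high surrogate
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2             # proper pair: skip both units
                continue
            loose.append(i)        # high surrogate with no low following
        elif 0xDC00 <= u <= 0xDFFF:
            loose.append(i)        # low surrogate with no high preceding
        i += 1
    return loose

if __name__ == "__main__":
    print(loose_surrogates([0x0041, 0xD800, 0xDC00, 0x0042]))
    print(loose_surrogates([0xD800, 0x0041]))
```

A caller can then decide what to do with the reported positions (ignore,
log, replace); the units at those positions carry no character meaning
either way.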


Schlingt dreifach einen Kreis um dies! || John Cowan
Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau,
Und trank die Milch vom Paradies. -- Coleridge (tr. Politzer)
