Re: Hypersurrogates: a proposed convention for ISO 10646 -> Unicode mapping

From: Robert Brady (robert@ents.susu.soton.ac.uk)
Date: Wed Nov 17 1999 - 12:23:44 EST


On Wed, 17 Nov 1999, John Cowan wrote:

> In UCS-4/UTF-32 encoding, hypersurrogates cause a 100% growth in octet size,
> from 4 octets to 8. In UTF-8 encoding, hypersurrogates cause only a 50%
> growth, from 6 octets to 8.

I hope not! Anyone wishing to encode characters higher than U-0010FFFF in
UTF-8 should encode them directly, for the same reason that characters in
planes 1 through 16 should be encoded as themselves in UTF-8, and not as a
combination of two surrogates.
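
To make the octet counts concrete, here is a minimal Python sketch. It
assumes the 1-to-6-octet UTF-8 definition that was current at the time
(RFC 2279, covering the full 31-bit UCS-4 range); later definitions stop at
U+10FFFF. The function names utf8_encode_ucs4 and surrogate_pair are
hypothetical, chosen only for this illustration.

```python
def utf8_encode_ucs4(cp: int) -> bytes:
    """Encode one code point with the 1-to-6-octet UTF-8 scheme (assumed: RFC 2279)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit UCS-4 range")
    if cp < 0x80:
        return bytes([cp])
    # Each row: (largest code point, lead-octet marker, continuation octets).
    for limit, lead, n_cont in (
        (0x7FF, 0xC0, 1),
        (0xFFFF, 0xE0, 2),
        (0x1FFFFF, 0xF0, 3),
        (0x3FFFFFF, 0xF8, 4),
        (0x7FFFFFFF, 0xFC, 5),
    ):
        if cp <= limit:
            out = [lead | (cp >> (6 * n_cont))]
            # Remaining bits go out six at a time, most significant first.
            for i in range(n_cont - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise AssertionError("unreachable: range was checked above")


def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a plane 1..16 code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only planes 1 through 16 have surrogate pairs")
    v = cp - 0x10000
    return 0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)


# A plane 1 character encoded as itself: 4 octets.
direct = utf8_encode_ucs4(0x10000)

# The same character encoded as two separate surrogates: 3 + 3 = 6 octets.
high, low = surrogate_pair(0x10000)
via_surrogates = utf8_encode_ucs4(high) + utf8_encode_ucs4(low)

# Code points above U-0010FFFF encoded as themselves: 4, 5, or 6 octets.
beyond = [len(utf8_encode_ucs4(cp)) for cp in (0x110000, 0x200000, 0x7FFFFFFF)]
```

Under this scheme a directly encoded character never takes more than 6
octets, so going through any pair-based mechanism can only match or exceed
the direct encoding in size.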

-- 
Robert


