Re: 8-bit text which is supposed to be UTF-8 but isn't

From: John Cowan
Date: Mon Jan 31 2000 - 10:09:40 EST

Dan Oscarsson wrote:
> Yes, UTF-16 was done right. Unfortunately UTF-8 was done wrongly. Just as
> UTF-16 is compatible with code in the 16-bit space, UTF-8 should have
> been compatible with the first characters of 8 bits.

How? An 8-bit code compatible with UTF-16 in its first 8 bits has
no space left to represent the other 109744 codepoints. Unlike the
16-bit codespace from 0 to FFFF, the 8-bit codespace from 0 to FF is
densely packed with characters.
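
To make the contrast concrete, here is a minimal Python sketch. It assumes
Python 3 and its bundled unicodedata tables; the names latin1, unassigned,
surrogates and utf16_units are purely illustrative. It checks that none of
the 256 values from 0 to FF is unassigned, whereas the 16-bit space holds
back a block of surrogate values that UTF-16 uses in pairs to reach beyond
FFFF.

```python
import unicodedata

# Every byte value 0x00..0xFF maps to a distinct code point U+0000..U+00FF.
latin1 = bytes(range(256)).decode("latin-1")

# Category "Cn" means "unassigned"; controls ("Cc") count as assigned.
unassigned = [c for c in latin1 if unicodedata.category(c) == "Cn"]

# In the 16-bit space, category "Cs" marks the reserved surrogate values.
surrogates = [cp for cp in range(0x10000)
              if unicodedata.category(chr(cp)) == "Cs"]


def utf16_units(ch):
    """Return the 16-bit code units that UTF-16 uses for one character."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]


print(len(unassigned))                       # free slots in 0..FF
print(len(surrogates))                       # reserved slots in 0..FFFF
print([hex(u) for u in utf16_units(chr(0x10000))])
print("\u00e9".encode("latin-1"), "\u00e9".encode("utf-8"))
```

With no free values in 0 to FF, a byte-oriented encoding has to give
something up: UTF-8 keeps 0 to 7F as single bytes and spends the values
from 80 upward on multi-byte sequences, which is why U+00E9 is one byte in
Latin-1 but two bytes in UTF-8.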


Schlingt dreifach einen Kreis vom dies! || John Cowan <>
Schliesst euer Aug vor heiliger Schau,  ||
Denn er genoss vom Honig-Tau,           ||
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)
