Re: Why "UTF-5" is not a UTF

From: John Cowan (jcowan@reutershealth.com)
Date: Fri Mar 03 2000 - 15:36:16 EST


Kenneth Whistler wrote:

> "A transfer encoding syntax is a reversible transform of encoded data which
> may (or may not) include textual data represented in one or more character
> encoding schemes."
>
> TES's are things like base64, uuencode, BinHex, quoted-printable, etc., that
> are designed to convert textual (or other) data into sequences of byte
> values that avoid particular values that would confuse one or more Internet or
> other transmission/storage protocols.

But UTF-8 was originally created to keep the octets 00 and 2F out of the
representation of any character other than U+0000 and U+002F, because the
Unix and Plan 9 filesystems treat those octets specially (as the string
terminator and the path separator).

What's the difference?
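
To make the octet-avoidance property concrete, here is a minimal sketch that
checks it by brute force, assuming Python 3 and its built-in "utf-8" and
"utf-16-be" codecs. The function names are hypothetical, chosen only for
illustration, and the UTF-16 comparison is my own addition to show what the
property rules out.

```python
# Sketch only: brute-force check of the property described above.
# Function names are hypothetical, not from any standard or library.

def offending_code_points(encoding: str) -> list[int]:
    """Return code points other than U+0000 and U+002F whose encoded
    form contains the octet 00 or 2F."""
    bad = []
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not characters and cannot be encoded
        if cp in (0x0000, 0x002F):
            continue  # these two are allowed to produce 00 and 2F
        encoded = chr(cp).encode(encoding)
        if 0x00 in encoded or 0x2F in encoded:
            bad.append(cp)
    return bad


def utf8_is_filesystem_safe() -> bool:
    """True if no other character ever yields the octets 00 or 2F in UTF-8."""
    return not offending_code_points("utf-8")


if __name__ == "__main__":
    print("UTF-8 safe:", utf8_is_filesystem_safe())
    # For contrast: U+2F00 in big-endian UTF-16 is the two octets 2F 00.
    print("U+2F00 as UTF-16BE:", chr(0x2F00).encode("utf-16-be").hex())
```

In UTF-8 every octet of a multi-octet sequence has its high bit set, so the
check passes; a 16-bit encoding gives no such guarantee.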

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


