Re: UTF-5 specification (and UTF-7)

From: Kenneth Whistler
Date: Fri Mar 03 2000 - 14:15:19 EST

Doug asked:

> On the side, Ken's explanation of the difference between character
> encoding schemes and transfer encoding schemes was authoritative and
> interesting as always, but it left me (oh dear) confused again: How is
> UTF-5 different from UTF-7 in this regard? Ken wrote:
> > TES's are things like base64, uuencode, BinHex, quoted-printable, etc.,
> > that are designed to convert textual (or other) data into sequences of
> > byte values that avoid particular values that would confuse one or more
> > Internet or other transmission/storage protocols.
> Gosh, that sounds like UTF-7 -- avoiding certain byte values that may not
> be permissible in RFC 822 e-mail. What's the difference? Is UTF-7 not
> a true UTF either by this definition?

Correct. UTF-7 should also be considered a TES. It, too, is unfortunately

Note that "UTF-7" is not mentioned in the Unicode Standard, Version 3.0
under the section on Transformations (p. 45 ff.) nor in the discussion of
encoding forms in Chapter 2 (p. 19 ff.). Essentially "UTF-7" is disavowed
by the Unicode Standard. It *is* defined in the glossary, which refers
to RFC-2152, but frankly, most of us wish it would just go away. ;-)
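
To make the TES point concrete, here is a minimal sketch in Python, assuming its built-in "utf-7" codec (which follows RFC 2152) and the standard base64 module. The sample string and the helper name is_seven_bit are illustrative only and do not come from the standard or the RFC. It shows that UTF-7 output, like base64 output, stays within 7-bit byte values, whereas UTF-8 does not.

```python
import base64

# Illustrative sample: ASCII plus a few non-ASCII characters (NOT EQUAL TO, two CJK ideographs).
text = "Hi \u2260 \u65e5\u672c"

def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte value is below 0x80 (hypothetical helper)."""
    return all(b < 0x80 for b in data)

utf8_bytes = text.encode("utf-8")        # a character encoding scheme; uses bytes >= 0x80
utf7_bytes = text.encode("utf-7")        # RFC 2152; restricted to 7-bit byte values
b64_bytes = base64.b64encode(utf8_bytes)  # a classic TES layered over UTF-8

if __name__ == "__main__":
    print("UTF-8  7-bit only:", is_seven_bit(utf8_bytes))
    print("UTF-7  7-bit only:", is_seven_bit(utf7_bytes))
    print("base64 7-bit only:", is_seven_bit(b64_bytes))
    print("UTF-7 round trip: ", utf7_bytes.decode("utf-7") == text)
```

In that respect UTF-7 behaves like the other transfer encodings listed above: it exists to avoid byte values a transport might reject.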

