Re: UTF-5 specification (and UTF-7)

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Mar 03 2000 - 14:15:19 EST


Doug asked:

>
> On the side, Ken's explanation of the difference between character
> encoding schemes and transfer encoding schemes was authoritative and
> interesting as always, but it left me (oh dear) confused again: How is
> UTF-5 different from UTF-7 in this regard? Ken wrote:
>
> > TES's are things like base64, uuencode, BinHex, quoted-printable, etc.,
> > that are designed to convert textual (or other) data into sequences of
> > byte values that avoid particular values that would confuse one or more
> > Internet or other transmission/storage protocols.
>
> Gosh, that sounds like UTF-7 -- avoiding certain byte values that may not
> be permissible in RFC 822 e-mail. What's the difference? Is UTF-7 not
> a true UTF either by this definition?
>

Correct. UTF-7 should also be considered a TES. It, too, is unfortunately
named.
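
To make the comparison concrete, here is a minimal sketch in Python. It is
not part of RFC 2152 or the Unicode Standard; the sample string and the
helper name is_seven_bit are assumptions chosen purely for illustration.
It relies only on Python's built-in "utf-7" codec and the standard base64
module, and it checks that both produce nothing but 7-bit byte values that
decode back to the original text, which is the behavior described above
for a TES.

```python
import base64

def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte value is below 0x80 (7-bit safe)."""
    return all(b < 0x80 for b in data)

# Hypothetical sample: ASCII text followed by two CJK characters.
text = "Hi \u65e5\u672c"

# UTF-7: ASCII letters and the space pass through unchanged; the
# non-ASCII run is wrapped in a '+' ... block of modified base64.
utf7_bytes = text.encode("utf-7")

# Plain base64 applied to the UTF-8 bytes: an uncontested TES.
b64_bytes = base64.b64encode(text.encode("utf-8"))

print(utf7_bytes)
print(b64_bytes)

# Both outputs avoid byte values above 0x7F ...
print(is_seven_bit(utf7_bytes), is_seven_bit(b64_bytes))

# ... and both are reversible back to the same text.
print(utf7_bytes.decode("utf-7") == text)
print(base64.b64decode(b64_bytes).decode("utf-8") == text)
```

The visible difference is only that UTF-7 leaves the ASCII portion
readable while base64 transforms everything; in both cases the purpose is
to dodge byte values that a transport might mishandle.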

Note that "UTF-7" is not mentioned in the Unicode Standard, Version 3.0
under the section on Transformations (p. 45 ff.) nor in the discussion of
encoding forms in Chapter 2 (p. 19 ff.). Essentially "UTF-7" is disavowed
by the Unicode Standard. It *is* defined in the glossary, which refers
to RFC 2152, but frankly, most of us wish it would just go away. ;-)

--Ken


