Re: Unicode Web pages

From: Martin J. Duerst
Date: Thu Feb 06 1997 - 06:26:26 EST

On Wed, 5 Feb 1997, Misha Wolf wrote:

> Rob Pike wrote:

> >I believe what you're supposed
> >to say is charset=UNICODE-1-1-UTF-8.
> Both MS and NS have recently moved to "UTF-8".

Rob - Maybe you are assuming that UTF-8 is a general method for
encoding 4-byte quantities. This is not the case. UTF stands
for UCS Transformation Format, and UCS is the Universal
Character Set, i.e. Unicode/ISO 10646.
Also please note that RFC 2044 defines the "charset" tag UTF-8.
However, there is one problem in that RFC (due to the slow
RFC process last year), namely that RFC 2044 is written relative
to Unicode 1.1, whereas everyone agrees that "UTF-8" should
indeed be used for Unicode 2.0 and upwards.
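
To make the point concrete, here is a minimal sketch in Python (the
language, the helper name utf8_bytes and the sample characters are
my own choices for illustration, not anything taken from RFC 2044).
It encodes three UCS characters as UTF-8 and shows that the result
is a variable-length byte sequence per character, not a fixed
4-byte quantity. The last line shows the plain "UTF-8" charset tag
as it might appear in a Content-Type header.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 byte sequence for one UCS character.

    Hypothetical helper for illustration; it only wraps str.encode.
    """
    return ch.encode("utf-8")


# Sample characters (chosen for illustration):
#   U+0041 LATIN CAPITAL LETTER A
#   U+00E9 LATIN SMALL LETTER E WITH ACUTE
#   U+3042 HIRAGANA LETTER A
samples = ["\u0041", "\u00e9", "\u3042"]

# Number of bytes UTF-8 needs for each sample character.
lengths = [len(utf8_bytes(ch)) for ch in samples]

for ch in samples:
    encoded = utf8_bytes(ch)
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# The charset tag under discussion, in a Content-Type header value.
content_type = "text/html; charset=UTF-8"
print(content_type)
```

The byte count depends on the character's position in the UCS, which
is why the label names an encoding of Unicode/ISO 10646 rather than
a generic packing of 4-byte values.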

Regards, Martin.
