Re: UTF-8 codification

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jun 05 2000 - 20:24:59 EDT


Digging around in older email, I saw Doug Ewell say:

> "Daniel CAUNE" <d.caune@citb.bull.net> wrote:
>
> > Where can I find a white paper about UTF-8 codification ? Is there a
> > such document on the Unicode Organisation Web site ?
>
> You know, I've been meaning to mention that. There is no definition of
> UTF-8 anywhere on the Unicode Web site,

http://www.unicode.org/glossary/

defines the *term* UTF-8, although it doesn't define the bit mapping for
the encoding form.

And Doug is correct that we don't currently have a summary discussion
of how UTF-8 works. This is an oversight that ought to be addressed
on the website.

> except for incidental references
> in Technical Reports 16, 17, 18, and 22, and none in the Unicode 3.0
> book,

See Definition D36, page 47, Section 3.8 Transformations, in the Unicode
Standard, Version 3.0. Accompanying that definition is Table 3-1, which
shows the UTF-8 bit distribution in detail.
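
For readers without the book at hand, the bit distribution can be sketched
roughly as below. This is an illustrative sketch based on the widely
published UTF-8 byte layout (1 to 4 bytes per scalar value), not a
transcription of Table 3-1; the function name utf8_encode is hypothetical,
and the rejection of surrogates and out-of-range values is an assumption
about how strict an encoder should be.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (hypothetical helper)."""
    # Assumption: reject values outside U+0000..U+10FFFF and surrogates.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

One quick way to check the sketch is to compare its output against
Python's built-in encoder, e.g. utf8_encode(0x20AC) against
"\u20ac".encode("utf-8").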

> except for a pointer to the sample implementation on the CD-ROM.
>
> If UTF-8-encoded Unicode is going to become the worldwide standard we
> want it to be, it should really be easier to find the UTF-8 algorithm
> on the Unicode Web site. Questions like Daniel's are going to come up
> again and again until it is.
>

I agree.

--Ken


