RE: How is UTF8, UTF16 and UTF32 encoded?

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri May 31 2002 - 14:52:20 EDT


Rick Cameron asked:

> The Unicode Standard 2.0 had a table in Appendix A that is, I think, just
> what you're asking for. I can't find this table in the online version of TUS
> 3.0 (it's not very useful that the online index gives page numbers, when
> there's no way to map a page number to the appropriate chapter!)
>
> Does anyone know whether this table (A-3 on page A-7) is available online
> somewhere?

Table A-3 from Unicode 2.0 moved into Chapter 3 in Unicode 3.0, since
UTF-8 was itself formally incorporated into Unicode conformance at
that point. See Table 3-1 on page 47 of Unicode 3.0. (Unfortunately, access
to the table was not clearly indicated under the "UTF-8" entry in the
index to Unicode 3.0 -- an oversight that will definitely be fixed for
Unicode 4.0.) You can find it online in Chapter 3 of the online text of
Unicode 3.0 at:

http://www.unicode.org/unicode/uni2book/u2.html

The surrounding text for Table 3-1 was modified for Unicode 3.1, so
the table appears online again in the Unicode 3.1 report:

http://www.unicode.org/unicode/reports/tr27/

(See Article III Conformance, in that UAX.)
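
For readers who cannot get to the table right away, the UTF-8 bit distribution it describes can be sketched roughly as follows. This is only an illustrative Python sketch, not text from the standard: the function name utf8_encode is made up for this example, and it works directly from a scalar value rather than from UTF-16 code units.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value as UTF-8 (hypothetical helper)."""
    # Surrogate code points and values above 10FFFF are not scalar values.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110yyyyy 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110zzzz 10yyyyyy 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110uuu 10uuzzzz 10yyyyyy 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

A quick sanity check is to compare the result with Python's built-in codec, for example utf8_encode(0x20AC) against "\u20ac".encode("utf-8").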

And finally, Unicode 3.1 added a subsidiary table of Legal UTF-8 Byte
Sequences. That table was modified slightly for Unicode 3.2, so the
most up-to-date version online can be found in Unicode 3.2:

http://www.unicode.org/unicode/reports/tr28/

(See Article III Conformance, in that UAX.)
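
As a rough illustration of what a table of legal byte sequences is used for, here is a small Python sketch of a validator driven by such a table. The names LEGAL and is_legal_utf8 are invented for this example, and the byte ranges are written out as I understand the well-formed UTF-8 ranges in current versions of the standard (shortest form only, no surrogates, nothing above 10FFFF); please check them against the table in the UAX itself before relying on them.

```python
# Each row: allowed (low, high) range for the lead byte, then for each
# trailing byte. Ranges are an assumption transcribed for illustration.
LEGAL = [
    ((0x00, 0x7F),),
    ((0xC2, 0xDF), (0x80, 0xBF)),
    ((0xE0, 0xE0), (0xA0, 0xBF), (0x80, 0xBF)),
    ((0xE1, 0xEC), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xED, 0xED), (0x80, 0x9F), (0x80, 0xBF)),
    ((0xEE, 0xEF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF0, 0xF0), (0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF1, 0xF3), (0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)),
    ((0xF4, 0xF4), (0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)),
]


def is_legal_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data matches a row of LEGAL."""
    i = 0
    while i < len(data):
        for row in LEGAL:
            chunk = data[i:i + len(row)]
            # A truncated sequence yields a short chunk and cannot match.
            if len(chunk) == len(row) and all(
                lo <= b <= hi for b, (lo, hi) in zip(chunk, row)
            ):
                i += len(row)
                break
        else:
            # No row matched at this position.
            return False
    return True
```

With these ranges, an overlong form such as C0 80 and an encoded surrogate such as ED A0 80 are both rejected, while ordinary text encoded by a conformant encoder passes.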

--Ken
 


