Re: Understanding UTF-8 and ISO correlation's.

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue May 25 1999 - 15:06:19 EDT


To supplement Andrea Vine's answer to James Liptak's question:

>
> I am James Liptak and trying to understand the correlation between UTF-8 and
> ISO standards. From what I can see, ISO 10646 is the mapping, but it
> then goes on to mention ISO 8859-1 (Latin-1), not the extended Latin
> 8859-2.
>
> My question is: Does UTF-8 use all of ISO 8859 (Parts 1-9), or is it specific
> and only handles specific parts of ISO 8859? Version 2.0 of the Unicode
> Standard is vague on this.

UTF-8 is an encoding form of the Unicode Standard (and of ISO/IEC 10646).
As such, it can be used to represent *any* Unicode character.
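
A minimal sketch of that point, assuming Python 3 and its built-in UTF-8
codec (neither is part of the original exchange; the sample characters and
the helper name utf8_hex are picked purely for illustration). It prints the
UTF-8 byte sequence for a few characters from different ranges of the
code space:

```python
# Illustrative sample only: a Latin-1 letter, a Latin-2 letter, the euro
# sign (added in 8859-15), and a character outside the 16-bit range.
samples = ["A", "\u00e9", "\u0142", "\u20ac", "\U00010348"]


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


for ch in samples:
    # One to four bytes are produced, depending on the code point.
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
```

The same call works for any assigned Unicode character; only the length of
the resulting byte sequence changes.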

The repertoire of encoded characters in the Unicode Standard includes
*all* of the characters covered by ISO/IEC 8859, Parts 1 - 9 (as well as
Parts 10, 13, 14, and 15). So if you use UTF-8, you can represent
any of the characters included in any of the published 8859 series of
standards.
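
One way to spot-check this claim is sketched below. It assumes Python 3 and
relies on Python's bundled iso8859_N codecs as a stand-in for the 8859
mapping tables (an assumption of this sketch, not something stated in the
message); check_part and PARTS are hypothetical names. Each byte value that
a given part defines is decoded to a Unicode character, encoded as UTF-8,
and decoded again:

```python
# Parts listed in the message above. The codec names are Python's own.
PARTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13, 14, 15]


def check_part(part: int) -> int:
    """Round-trip every defined byte of one 8859 part through UTF-8.

    Returns the number of byte values the codec defines for that part.
    """
    codec = f"iso8859_{part}"
    defined = 0
    for value in range(256):
        raw = bytes([value])
        try:
            text = raw.decode(codec)   # 8859 byte -> Unicode character
        except UnicodeDecodeError:
            continue                   # byte value unassigned in this part
        utf8 = text.encode("utf-8")    # Unicode character -> UTF-8 bytes
        assert utf8.decode("utf-8") == text
        defined += 1
    return defined


for part in PARTS:
    count = check_part(part)
    print(f"ISO/IEC 8859-{part}: {count} byte values round-trip via UTF-8")
```

A check like this only exercises the mapping tables shipped with Python; it
illustrates the statement above rather than proving it against the
standards themselves.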

--Ken Whistler, Technical Director, Unicode, Inc.


