unicode/iso-10646-1 covers the _repertoire_ of all of the iso-8859-* parts as far
as they were published until recently. i am not sure about the newest 8859 parts.
covering the repertoire means that all the characters are there - and many more.
moreover, the first 256 codes in unicode are the same numbers/values as the
codes in iso-8859-1.
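
a minimal sketch of that identity, assuming python's built-in "iso-8859-1" codec is a faithful table for the standard (the variable names are just for illustration):

```python
# Decode every possible byte value as ISO-8859-1 and compare each
# resulting Unicode code point with the original byte value.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")

# True if byte N always decodes to code point U+00NN.
latin1_is_identity = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))
print(latin1_is_identity)
```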
naturally, this is not true for all 256 codes for any other iso-8859-*, i.e.,
some characters have different numbers.
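
to see which codes differ in another part, something like the following could be used. it is only a sketch: differing_bytes is a hypothetical helper, and it relies on python's codec tables, which also map the 0x80-0x9f control range straight through.

```python
def differing_bytes(codec):
    """Return {byte: character} for bytes whose Unicode code point
    is not numerically equal to the byte value in the given codec."""
    diffs = {}
    for b in range(256):
        try:
            ch = bytes([b]).decode(codec)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this part
        if ord(ch) != b:
            diffs[b] = ch
    return diffs

latin2_diffs = differing_bytes("iso-8859-2")
# 0xA1 in iso-8859-2 is U+0104 (A with ogonek), not U+00A1.
print(hex(ord(latin2_diffs[0xA1])))
```

with these tables, the differences for iso-8859-2 all sit in the upper range, while the ascii half is identical.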
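utf-8 is one of several encodings for unicode/iso-10646. it defines how to get
31-bit codes into bytes: ascii characters stay single bytes, and every non-ascii
character takes more than one byte (2 bytes for nearly all of the extended
characters in the iso-8859-*; a few, such as the euro sign in iso-8859-15, take 3).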
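
a small sketch of the byte layout for the one- and two-byte forms. encode_up_to_two_bytes is a hypothetical hand-rolled helper, checked here against python's own utf-8 codec; note that current codecs stop at U+10FFFF (at most 4 bytes) rather than covering the full 31-bit range.

```python
def encode_up_to_two_bytes(cp):
    """Hand-rolled UTF-8 for code points below 0x800 (illustration only)."""
    if cp < 0x80:
        # 0xxxxxxx: ASCII is passed through unchanged.
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx: top 5 bits, then low 6 bits.
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    raise ValueError("code point needs three or more bytes")

# "A", e-acute (iso-8859-1 0xE9), A-ogonek (iso-8859-2 0xA1)
for cp in (0x41, 0xE9, 0x104):
    manual = encode_up_to_two_bytes(cp)
    builtin = chr(cp).encode("utf-8")
    print(hex(cp), manual.hex(), manual == builtin)

# The euro sign (U+20AC) is above 0x7FF and takes three bytes.
euro_length = len("\u20ac".encode("utf-8"))
```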
have a look at http://www.unicode.org and at http://www.dkuug.dk/jtc1/sc2/wg2/
Markus Scherer IBM RTP +1 919 486 1135 Dept. Fax +1 919 254 6430
Unicode is here! --> http://www.unicode.org/
"James Liptak" <firstname.lastname@example.org> on 99-05-25 12:52:09
To: Unicode List <email@example.com>
cc: (bcc: Markus Scherer/Raleigh/Contr/IBM)
Subject: Understanding UTF-8 and ISO correlation's.
I am James Liptak and am trying to understand the correlation between UTF-8 and
ISO standards. From what I can see, ISO-10646 is the mapping, but it
then goes on to say ISO-8859-1 (Latin-1), not the Extended Latin
My question is: Does UTF-8 use all of ISO-8859-(1-9), or is it specific
and only handles specific parts of ISO-8859? The publication V2.0 of the
Unicode standard is vague on this.