Re: Understanding UTF-8 and ISO correlations.

From: G. Adam Stanislav (adam@whizkidtech.net)
Date: Tue May 25 1999 - 22:17:52 EDT


At 09:52 25-05-1999 -0700, James Liptak wrote:
>I am James Liptak, and I am trying to understand the correlation between
>UTF-8 and the ISO standards. From what I can see, ISO-10646 is the mapping,
>but it then goes on to mention ISO-8859-1 (Latin-1), not the Extended Latin
>ISO-8859-2.

UTF-8 has nothing to do with ISO-8859. It is an encoding of 16-bit Unicode
(or 31-bit ISO-10646) into a sequence of 8-bit bytes.
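
To make the byte-sequence idea concrete, here is a minimal sketch in Python
of how one code point is packed into UTF-8 bytes. The function name
utf8_encode is made up for this illustration, and the sketch assumes only
the 1- to 4-byte forms (values up to 0x10FFFF), not the longer forms needed
for the full 31-bit range.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point into UTF-8 by hand (1- to 4-byte forms only)."""
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        # 7 bits: the byte is the value itself, same as ASCII
        return bytes([cp])
    if cp < 0x800:
        # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp < 0x110000:
        # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("this sketch stops at 0x10FFFF")


print(utf8_encode(0x41).hex())    # 41
print(utf8_encode(0xE9).hex())    # c3a9
print(utf8_encode(0x20AC).hex())  # e282ac
```

Note that values below 0x80 come out as a single unchanged byte, which is
why plain ASCII text is already valid UTF-8.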

That is not to say you cannot UTF-8 encode ISO-8859 text; you just need to
convert it to Unicode first. I have written a series of utilities that do
the various types of conversions between ISO-8859 (or any other 8-bit
mapping), Unicode, and UTF-8. They can go in any direction.
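
The two-step conversion can be sketched as follows. This is only an
illustration using Python's built-in codecs, not the utilities mentioned
above; the function name latin2_to_utf8 is made up for the example, and it
assumes the input really is ISO-8859-2.

```python
def latin2_to_utf8(raw: bytes) -> bytes:
    """Convert ISO-8859-2 bytes to UTF-8 by going through Unicode."""
    text = raw.decode("iso8859_2")  # step 1: 8-bit bytes -> Unicode
    return text.encode("utf-8")     # step 2: Unicode -> UTF-8 bytes


# The same byte means different characters in different 8-bit mappings,
# which is why the source mapping has to be known before encoding.
raw = b"\xb9"
print(raw.decode("iso8859_2"))                        # š (U+0161)
print(raw.decode("iso8859_1"))                        # ¹ (U+00B9)
print(latin2_to_utf8(raw).hex())                      # c5a1
print(raw.decode("iso8859_1").encode("utf-8").hex())  # c2b9
```

Going the other way is the mirror image: decode the UTF-8 bytes to Unicode,
then encode with the 8-bit mapping, which fails for any character that
mapping does not contain.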

If you wish, you're more than welcome to them. They are at
http://www.whizkidtech.net/i18n/ . Or, if your OS happens to be FreeBSD,
just type "cd /usr/ports/converters/i18ntools;make install".

Adam


