Re: Character sets.....

From: Ken Lunde (lunde@adobe.com)
Date: Fri Aug 28 1998 - 10:41:56 EDT


Anupam,

You wrote:

>> Could anyone help me to find out the Character sets for
>> UTF-8, SJIS, JIS, EUC-JP and Unicode 2.0

You're confusing character sets with encodings. In the West (before
Unicode), the distinction was not so clear. But in the CJKV context,
it is necessary.

UTF-8 is an encoding (a transformation format) for Unicode 2.0 (now
2.1). Shift-JIS is an encoding that encapsulates the JIS X 0201-1997
(JIS-Roman and half-width katakana) and JIS X 0208:1997 character
sets. JIS (now called ISO-2022-JP) supports only JIS-Roman from JIS X
0201-1997 plus JIS X 0208:1997. EUC-JP supports the same character
sets as Shift-JIS, plus JIS X 0212-1990. Unicode 2.0 is a character
set name (it is now Version 2.1).
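
To make the distinction concrete, here is a minimal sketch (not part of
the original message) that encodes single characters under each of the
four encodings. It assumes Python 3 and its built-in codec names
(utf-8, shift_jis, iso2022_jp, euc_jp); the helper name encode_all is
hypothetical, and Python's codec tables are newer than the 1998
standards discussed here, so treat the results as illustrative.

```python
# Illustrative sketch: codec names are Python's; encode_all is a made-up helper.
ENCODINGS = ["utf-8", "shift_jis", "iso2022_jp", "euc_jp"]


def encode_all(ch):
    """Map each encoding name to the encoded bytes, or None if the
    character is not in any character set that encoding supports."""
    result = {}
    for name in ENCODINGS:
        try:
            result[name] = ch.encode(name)
        except UnicodeEncodeError:
            result[name] = None
    return result


if __name__ == "__main__":
    samples = [
        "\u65e5",  # kanji from JIS X 0208
        "\uff71",  # half-width katakana from JIS X 0201
        "\u4e02",  # kanji from JIS X 0212 (assumed not in JIS X 0208)
    ]
    for ch in samples:
        print(f"U+{ord(ch):04X}")
        for name, data in encode_all(ch).items():
            shown = data.hex(" ") if data is not None else "not representable"
            print(f"  {name:<11} {shown}")
```

The same JIS X 0208 character comes out as different bytes under each
encoding, while the half-width katakana and JIS X 0212 samples are
rejected by the encodings that do not carry those character sets.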

-- Ken


