Re: unicode UNIX editor recommendations

From: Jungshik Shin (jshin@pantheon.yale.edu)
Date: Thu Feb 19 1998 - 20:56:01 EST


On Wed, 18 Feb 1998, Chong Chiah Jen wrote:

> : From: Jungshik Shin <jshin@pantheon.yale.edu>

  Thank you for the info.

> There is a multilingual support product named xMASS running on UNIX, X
> Windows which can give you the Unicode support environment. You may
> find the detailed information from:
> http://www.starglobe.com.sg/cgi-bin/multi/products/home.html

> Japanese SHIFT-JIS, EUC-JIS,ISO 2022
> Korean KSC

  Please change the name of the 8-bit **encoding** most widely used in
Korea (it encodes two coded character sets, KS C 5601 and US-ASCII/KS C
5636/ISO 646) to EUC-KR. "KS C" is not the name of that encoding. It is
the name of the section of the Korean Industrial Standard for
information exchange in which KS C 5601 (a 2-byte coded character set of
94x94 characters), KS C 5636 (a 1-byte coded character set of 94
characters, a local version of ISO 646/US-ASCII), KS C 5700 (the
Unicode 2.0/ISO 10646 equivalent) and others are defined.
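
  To make the point concrete, here is a minimal sketch in Python. It
assumes Python 3's built-in "euc_kr" codec, and the helper name row_cell
is hypothetical. For a KS C 5601 character, EUC-KR stores the row and
cell numbers (1 to 94), each offset by 0xA0, while US-ASCII/KS C 5636
characters remain single bytes.

```python
def row_cell(ch):
    """Return the KS C 5601 (row, cell) of one character.

    Returns None for characters that EUC-KR encodes as a single byte
    (the US-ASCII / KS C 5636 range). row_cell is a hypothetical helper.
    """
    raw = ch.encode("euc_kr")
    if len(raw) == 1:
        return None
    # EUC-KR = KS C 5601 row and cell numbers, each plus 0xA0.
    return raw[0] - 0xA0, raw[1] - 0xA0


hangul_bytes = "한글".encode("euc_kr")
print(hangul_bytes.hex())   # c7d1b1db
print(row_cell("가"))        # (16, 1)
print(row_cell("A"))         # None
```

  The character set (KS C 5601) supplies the row and cell; the encoding
(EUC-KR) decides which bytes carry them.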

  The same is true of EUC-JIS. It should NOT be referred to as EUC-JIS
but should be called EUC-JP, which encodes JIS X 0201 (Roman and
half-width katakana), JIS X 0208 and JIS X 0212. Even within the
Japanese section, ISO-2022 is too generic a name (besides, it is the
name of the international standard that predates ISO 10646 and Unicode
2.0); I'd rather call it ISO-2022-JP specifically. Please note that a
few other encodings have ISO-2022 in their names: ISO-2022-KR (RFC
1557), ISO-2022-CN (RFC 1922) and ISO-2022-CN-EXT (RFC 1922). Moreover,
even if only Japanese is considered, it is not clear which one you
meant, because there are two other encodings for Japanese (and, in the
case of ISO-2022-JP-2, for Chinese and Korean as well), namely
ISO-2022-JP-2 (RFC 1554) and ISO-2022-JP-1 (RFC 2237).
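
  A small Python sketch shows why the bare name "ISO 2022" is ambiguous.
It assumes Python 3's built-in "iso2022_jp", "iso2022_jp_2" and
"iso2022_kr" codecs; the variable names are only illustrative. The same
Korean text comes out as different byte streams under ISO-2022-JP-2 and
ISO-2022-KR, and plain ISO-2022-JP cannot carry it at all.

```python
ESC = b"\x1b"

# ISO-2022-JP: JIS X 0208 is designated with ESC $ B,
# and the stream returns to ASCII with ESC ( B.
japanese = "日本語".encode("iso2022_jp")

# ISO-2022-JP-2 (RFC 1554) designates KS C 5601 with ESC $ ( C.
korean_in_jp2 = "한글".encode("iso2022_jp_2")

# ISO-2022-KR (RFC 1557) announces KS C 5601 with ESC $ ) C
# and then switches in and out with SO (0x0E) and SI (0x0F).
korean_in_kr = "한글".encode("iso2022_kr")

# Plain ISO-2022-JP has no designation for KS C 5601.
try:
    "한글".encode("iso2022_jp")
    plain_jp_handles_korean = True
except UnicodeEncodeError:
    plain_jp_handles_korean = False

for name, data in [("ISO-2022-JP", japanese),
                   ("ISO-2022-JP-2", korean_in_jp2),
                   ("ISO-2022-KR", korean_in_kr)]:
    print(name, data)
print("ISO-2022-JP can encode Korean:", plain_jp_handles_korean)
```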

  I wish every company interested in CJK support would pay attention to
the distinction between a coded character set and an encoding for one or
more coded character sets. The distinction is not obvious for simple
1-byte encodings/character sets like ISO-8859-x and US-ASCII, but the
two shouldn't be mixed up when it comes to CJK. RFC 2130 sums up this
issue nicely.
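
  As a rough illustration (again assuming Python 3's built-in codecs;
the variable names are made up for this sketch), one character from a
single coded character set, JIS X 0208, yields three different byte
sequences under three encodings, yet the row and cell recovered from
EUC-JP and from ISO-2022-JP are identical.

```python
ch = "日"

euc = ch.encode("euc_jp")       # row/cell each offset by 0xA0
iso = ch.encode("iso2022_jp")   # ESC $ B, row/cell offset by 0x20, ESC ( B
sjis = ch.encode("shift_jis")   # row/cell folded by the Shift_JIS formula

# Recover the JIS X 0208 row and cell from the EUC-JP bytes.
kuten_from_euc = (euc[0] - 0xA0, euc[1] - 0xA0)

# Strip the two 3-byte escape sequences, then undo the 0x20 offset.
payload = iso[3:-3]
kuten_from_iso = (payload[0] - 0x20, payload[1] - 0x20)

print(euc.hex(), iso, sjis.hex())
print(kuten_from_euc, kuten_from_iso)   # (38, 92) (38, 92)
```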

    Once again, thank you for the information,

      Jungshik Shin


