Re: Shift-JIS/Unicode mapping in JAVA

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu May 29 2003 - 06:56:58 EDT

    From: "Kazuhiro Kazama" <kazama@ingrid.org>
    > From: Jane Liu <xjliu_ca@yahoo.com>
    > Subject: Shift-JIS/Unicode mapping in JAVA
    > Date: Wed, 28 May 2003 12:36:39 -0700 (PDT)
    > Message-ID: <20030528193639.92471.qmail@web10707.mail.yahoo.com>
    > > I am running a JAVA program on Japanese Windows 2000 system, looking
    > > at the Unicode conversion of the following four characters from
    > > Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
    > > noticed some interesting changes:
    >
    > I guess that you used the charset name "Shift_JIS". Would you try to
    > use "Windows-31J"?

    I think the canonical name of this encoding should be used, since "Windows-31J" is very uncommon.
    It therefore seems better to designate the encoding as "CP932" or "windows-932", which Windows and Internet Explorer (and probably many other browsers) also prefer.
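
    Since the question of which name to use depends on what the runtime actually recognizes, a quick check can help. The following is only a sketch: the class name CharsetNames and the helper canonicalName are made up for illustration, the list of labels is simply the names mentioned in this thread, and which of them are supported depends on the JRE (java.nio.charset requires 1.4 or later).

```java
import java.nio.charset.Charset;

public class CharsetNames {
    // Hypothetical helper: returns the canonical name the running JVM
    // reports for a label, or null if the label is illegal or unsupported.
    static String canonicalName(String label) {
        try {
            return Charset.isSupported(label) ? Charset.forName(label).name() : null;
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return null;
        }
    }

    public static void main(String[] args) {
        // Labels taken from this thread; support varies between JREs.
        String[] labels = {"Shift_JIS", "Windows-31J", "MS932", "CP932", "windows-932"};
        for (String label : labels) {
            String canonical = canonicalName(label);
            if (canonical == null) {
                System.out.println(label + " -> (not supported by this JRE)");
            } else {
                // aliases() lists the other names this JRE accepts for the same charset.
                System.out.println(label + " -> " + canonical + " "
                        + Charset.forName(label).aliases());
            }
        }
    }
}
```

    Running it on the JRE in question shows which labels resolve to the same charset and which are rejected, without relying on anyone's guess about naming.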

    It is true that MS-CP932 is NOT Shift-JIS, even though it is mostly compatible with it. It was created long ago as an extension of an *old* version of the JIS standard, and it includes characters that were later integrated into Shift_JIS. The current version of Shift_JIS now has more characters than Microsoft codepage 932. However, MS-CP932 also includes some characters that are defined in all Microsoft codepages and are still missing from Shift_JIS. They will not be added, now that Shift_JIS has been superseded by a newer version that supports all UniHan and Unicode/ISO 10646 characters.
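
    To see where the two encodings diverge on a given JRE, one can decode the same bytes under both names and compare the resulting code points. This is a minimal sketch, not a definitive test: the class SjisCompare and the method decodeToCodePoints are hypothetical, and the sample byte pairs are my own choice of values often cited as mapping differently (the original message does not list its four characters here). It assumes both charsets are installed in the JRE and needs Java 6 or later for the String constructor used.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class SjisCompare {
    // Hypothetical helper: decodes bytes with the named charset and
    // returns the result as space-separated "U+XXXX" code points.
    static String decodeToCodePoints(byte[] bytes, String charsetName) {
        String s = new String(bytes, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample double-byte sequences (an assumption, chosen for illustration).
        byte[][] samples = {
            {(byte) 0x81, (byte) 0x60},
            {(byte) 0x81, (byte) 0x61},
            {(byte) 0x81, (byte) 0x7C},
            {(byte) 0x81, (byte) 0x91},
        };
        String[] names = {"Shift_JIS", "Windows-31J"};
        for (byte[] sample : samples) {
            System.out.printf(Locale.ROOT, "0x%02X%02X:", sample[0] & 0xFF, sample[1] & 0xFF);
            for (String name : names) {
                String result = Charset.isSupported(name)
                        ? decodeToCodePoints(sample, name)
                        : "(unsupported)";
                System.out.print("  " + name + " -> " + result);
            }
            System.out.println();
        }
    }
}
```

    Any row where the two columns differ marks a character whose Unicode mapping depends on the charset name passed to the converter, which is the kind of change the original poster reported between JRE 1.3.1 and 1.4.1.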



    This archive was generated by hypermail 2.1.5 : Thu May 29 2003 - 07:38:58 EDT