Re: Shift-JIS/Unicode mapping in JAVA

From: Kazuhiro Kazama (kazama@ingrid.org)
Date: Wed May 28 2003 - 23:10:17 EDT

    From: Jane Liu <xjliu_ca@yahoo.com>
    Subject: Shift-JIS/Unicode mapping in JAVA
    Date: Wed, 28 May 2003 12:36:39 -0700 (PDT)
    Message-ID: <20030528193639.92471.qmail@web10707.mail.yahoo.com>
    > I am running a JAVA program on Japanese Windows 2000 system, looking
    > at the Unicode conversion of the following four characters from
    > Shift-JIS encoding (MS-CP932) in both JRE 1.3.1 and JRE 1.4.1, and
    > noticed some interesting changes:

    I guess that you used the charset name "Shift_JIS". Could you try
    "Windows-31J" instead?

    Two Shift-JIS variations are registered in the IANA registry:
    "Shift_JIS" and "Windows-31J". The former is for JIS X 0208 and the
    latter is for Microsoft's CP932. "Windows-31J" was proposed by one of
    Microsoft's Japanese engineers.

    "Shift_JIS" was aliased to JIS X 0208 in JDK 1.1 through 1.1.7, but it
    was re-aliased to CP932 in JDK 1.1.8 through J2SE 1.4 ("Windows-31J" is
    also aliased to CP932). This caused two problems: we could not select
    the right character encoding on J2EE platforms, and the mappings used
    by the JDK and by Xerces did not match (Xerces has its own alias table,
    which aliases "Shift_JIS" to JIS X 0208).
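
    The name a given runtime resolves each charset to can be inspected
    directly. The following is a minimal sketch, assuming a runtime that
    provides java.nio.charset (J2SE 1.4 or later) with the Japanese
    charsets installed. The class and method names are hypothetical and
    chosen for illustration only. It shows only the JDK's own alias
    table, not the separate table inside Xerces.

```java
import java.nio.charset.Charset;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the JDK or Xerces.
class CharsetAliasCheck {
    // Canonical name the running JDK resolves the given charset name to.
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    // Aliases the running JDK registers for the given charset, sorted.
    static Set<String> aliasesOf(String name) {
        return new TreeSet<String>(Charset.forName(name).aliases());
    }

    public static void main(String[] args) {
        String[] names = {"Shift_JIS", "Windows-31J"};
        for (String name : names) {
            System.out.println(name + " -> " + canonicalName(name)
                    + " " + aliasesOf(name));
        }
    }
}
```

    If both names report the same canonical charset, the runtime still
    treats "Shift_JIS" as CP932.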

    So we requested the following alias change and it was accepted in J2SE
    1.4.1:

    Shift_JIS -> JIS X 0208's shift-jis encoding.
    Windows-31J -> Microsoft's CP932

    See the list of changes in J2SE 1.4.1:

    http://java.sun.com/j2se/1.4.1/changes.html#Shift-JIS
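
    The effect of the two names can be compared by decoding the same
    bytes with each. The following is a minimal sketch, assuming J2SE
    1.4.1 or later. The byte pair 0x81 0x60 is an assumed example (it is
    not taken from the original question), and the class and method
    names are hypothetical. With the alias change in place, "Shift_JIS"
    is expected to yield U+301C (WAVE DASH) and "Windows-31J" U+FF5E
    (FULLWIDTH TILDE).

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical demo class for illustration only.
class ShiftJisMappingDemo {
    // Decode the bytes using the charset registered under the given name.
    static String decode(byte[] bytes, String charsetName) {
        return Charset.forName(charsetName)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Render each UTF-16 code unit of the string as U+XXXX.
    static String codeUnits(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append("U+").append(Integer.toHexString(s.charAt(i)).toUpperCase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample: a double-byte code that the two mappings treat differently.
        byte[] sample = {(byte) 0x81, (byte) 0x60};
        System.out.println("Shift_JIS   -> " + codeUnits(decode(sample, "Shift_JIS")));
        System.out.println("Windows-31J -> " + codeUnits(decode(sample, "Windows-31J")));
    }
}
```

    Code that must match the behaviour of Japanese Windows would name
    "Windows-31J" explicitly rather than rely on what "Shift_JIS" means
    in a particular JRE.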

    Kazuhiro Kazama (kazama@ingrid.org) NTT Network Innovation Laboratories


