Re: U+2212 (Minus Sign) and Java's ISO-2022-JP conversion

From: Markus Scherer (markus.icu@gmail.com)
Date: Fri Apr 01 2005 - 12:47:06 CST


    Charsets are a mess.
    Japanese charsets are particularly notorious; see "XML Japanese
    Profile" http://www.w3.org/TR/japanese-xml/
    ISO-2022-* are even worse than others because no one publishes
    comprehensive documentation of how their converters map them.

    Evidently, in this case the Java 1.4 and 1.5 converters are different.

    On Apr 1, 2005 12:24 AM, Katsuhiko Momoi <momoi@alumni.indiana.edu> wrote:
    > Using Java's native2ascii conversion utility -- I used the one that came
    > with SDK 1.5 for Windows, \u2212 converts to ISO-2022-JP. ...
    > ... Java fails to convert \u2212 to ISO-2022-JP. (JDK version 1.4.x.)

    > Has anyone experienced this problem? I would appreciate a workaround or
    > a solution.

    Use UTF-8. Seriously.
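
    If you have to keep ISO-2022-JP for some recipients, one possible
    workaround is to ask the encoder whether it can represent the text and
    fall back to UTF-8 when it cannot. The following is only a rough
    sketch: the class name MinusSignCheck and the helpers canEncode and
    encodeWithFallback are made up for illustration, StandardCharsets
    needs Java 7 or later, and what it reports for ISO-2022-JP will depend
    on the JDK you run it on.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any JDK or library.
public class MinusSignCheck {
    // U+2212 MINUS SIGN, the character discussed in this thread.
    static final char MINUS_SIGN = '\u2212';

    // Returns true if the named charset's encoder can represent c.
    static boolean canEncode(String charsetName, char c) {
        Charset cs = Charset.forName(charsetName);
        // Some charsets are decode-only and cannot create an encoder.
        if (!cs.canEncode()) {
            return false;
        }
        return cs.newEncoder().canEncode(c);
    }

    // Encodes text in the preferred charset when every character is
    // representable there; otherwise falls back to UTF-8.
    static byte[] encodeWithFallback(String text, String preferred) {
        Charset cs = Charset.forName(preferred);
        if (cs.canEncode()) {
            CharsetEncoder encoder = cs.newEncoder();
            if (encoder.canEncode(text)) {
                return text.getBytes(cs);
            }
        }
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "ISO-2022-JP";
        // The charset may be missing from a reduced runtime image.
        if (Charset.isSupported(name)) {
            // The answer is expected to vary between JDK versions.
            System.out.println(name + " can encode U+2212: "
                    + canEncode(name, MINUS_SIGN));
        } else {
            System.out.println(name + " is not available in this runtime");
        }
        System.out.println("UTF-8 can encode U+2212: "
                + canEncode("UTF-8", MINUS_SIGN));
    }
}
```

    Checking first avoids silently sending the encoder's replacement
    character in place of the minus sign. The receiving side still has to
    be told which charset was actually used.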

    markus

    -- 
    Opinions expressed here may not reflect my company's positions unless
    otherwise noted.
    

