Re: Shift-JIS conversion.

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Nov 25 2004 - 12:08:47 CST


    You just need a mapping table from Unicode code points to Shift-JIS code positions, and a very simple UTF-8 decoder to turn the UTF-8 byte sequences into Unicode code points.
    You'll find a mapping table on the Unicode FTP server. The UTF-8 form is fully documented in the Conformance chapter of the Unicode Standard, and converting UTF-8 to 21-bit Unicode code points requires no table.
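
    As a rough illustration of those two steps, here is a minimal Java sketch. The class name, the method names and the two-entry SAMPLE_TABLE are made up for this example; a real converter would load the complete mapping table from the published file instead. The decoder uses only bit arithmetic, and the encoder fails loudly on any code point that has no Shift-JIS mapping.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class Utf8ToSjisSketch {

    // Hypothetical two-entry table for the demo only; a real one has thousands of rows.
    static final Map<Integer, Integer> SAMPLE_TABLE = new HashMap<>();
    static {
        SAMPLE_TABLE.put(0x0041, 0x41);    // LATIN CAPITAL LETTER A
        SAMPLE_TABLE.put(0x3042, 0x82A0);  // HIRAGANA LETTER A
    }

    /** Decodes UTF-8 bytes into Unicode code points; no lookup table involved. */
    static int[] decodeUtf8(byte[] in) {
        int[] out = new int[in.length];  // never more code points than bytes
        int n = 0;
        int i = 0;
        while (i < in.length) {
            int b = in[i++] & 0xFF;
            int cp, extra, min;  // payload bits, continuation bytes, smallest legal value
            if (b < 0x80)                    { cp = b;        extra = 0; min = 0; }
            else if (b >= 0xC2 && b <= 0xDF) { cp = b & 0x1F; extra = 1; min = 0x80; }
            else if (b >= 0xE0 && b <= 0xEF) { cp = b & 0x0F; extra = 2; min = 0x800; }
            else if (b >= 0xF0 && b <= 0xF4) { cp = b & 0x07; extra = 3; min = 0x10000; }
            else throw new IllegalArgumentException("invalid lead byte at offset " + (i - 1));
            for (int k = 0; k < extra; k++) {
                if (i >= in.length || (in[i] & 0xC0) != 0x80)
                    throw new IllegalArgumentException("bad continuation byte at offset " + i);
                cp = (cp << 6) | (in[i++] & 0x3F);
            }
            // Reject overlong forms, surrogates and values beyond U+10FFFF.
            if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
                throw new IllegalArgumentException("ill-formed sequence before offset " + i);
            out[n++] = cp;
        }
        return Arrays.copyOf(out, n);
    }

    /** Looks up each code point and emits one or two bytes per character. */
    static byte[] encode(int[] codePoints, Map<Integer, Integer> table) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int cp : codePoints) {
            Integer code = table.get(cp);
            if (code == null)
                throw new IllegalArgumentException(String.format("no mapping for U+%04X", cp));
            if (code > 0xFF) out.write(code >> 8);  // lead byte of a double-byte code
            out.write(code & 0xFF);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = {0x41, (byte) 0xE3, (byte) 0x81, (byte) 0x82};  // "A" then U+3042
        StringBuilder hex = new StringBuilder();
        for (byte b : encode(decodeUtf8(utf8), SAMPLE_TABLE))
            hex.append(String.format("%02X ", b & 0xFF));
        System.out.println(hex.toString().trim());
    }
}
```

    Throwing on a missing mapping is only one possible policy; substituting a replacement character is another, depending on what the receiving application tolerates.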

    There are existing tools that perform that for you, because they integrate both:

    - Java (international edition) has a reversible Shift-JIS mapping to Unicode. It is used by the charset support in the java.io and java.nio packages. You can even use the prebuilt native2ascii tool (from the Java SDK) to do the conversion:

        native2ascii -encoding UTF-8 < filename.UTF-8.txt \
           | native2ascii -reverse -encoding SHIFT-JIS > filename.SHIFT-JIS.txt

    - GNU recode on Linux/Unix may do that for you too.

    - The open-source ICU library offered by IBM has an API for this and supports mappings for lots of charsets.
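
    For the Java route, the same conversion can be done in-process with the java.nio.charset classes. The sketch below is only an illustration (the class and method names are invented here), and it assumes the running JDK ships the "Shift_JIS" charset, which full JDK distributions normally do. Setting CodingErrorAction.REPORT makes unmappable characters raise an exception instead of being silently replaced, which speaks to the concern about characters without a mapping.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetConvertSketch {

    /** Converts UTF-8 bytes to Shift_JIS bytes, failing on anything that cannot be mapped. */
    static byte[] utf8ToShiftJis(byte[] utf8) throws CharacterCodingException {
        // Step 1: UTF-8 bytes -> UTF-16 chars, rejecting malformed input.
        CharBuffer chars = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(utf8));

        // Step 2: chars -> Shift_JIS bytes, reporting characters with no mapping.
        ByteBuffer sjis = Charset.forName("Shift_JIS").newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(chars);

        byte[] out = new byte[sjis.remaining()];
        sjis.get(out);
        return out;
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] utf8 = "A\u3042".getBytes(StandardCharsets.UTF_8);  // "A" then HIRAGANA LETTER A
        StringBuilder hex = new StringBuilder();
        for (byte b : utf8ToShiftJis(utf8))
            hex.append(String.format("%02X ", b & 0xFF));
        System.out.println(hex.toString().trim());
    }
}
```

    If replacement is preferred over failure, CodingErrorAction.REPLACE can be used on the encoder instead, at the cost of losing the information about which characters were dropped.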

      ----- Original Message -----
      From: pragati
      To: unicode@unicode.org
      Sent: Thursday, November 25, 2004 6:00 AM
      Subject: Shift-JIS conversion.

      Hello,

        Can anyone please tell me how to convert from UTF-8 to Shift-JIS?
      Please let me know if there is any formula to do it other than using the ready-made functions provided by Perl, because these functions do not provide mappings for all characters.

      Warm Regards,
      Pragati Desai.

      Cybage Software Private Ltd.
      ph(0)- 020-4044700
      Extn: 302
      mailto: pragatid@cybage.com

       


