From: Philippe Verdy (firstname.lastname@example.org)
Date: Thu Nov 25 2004 - 12:08:47 CST
You just need two things: a mapping table from Unicode code points to Shift-JIS code positions, and a very simple parser that turns UTF-8 into Unicode code points.
You'll find a mapping table on the Unicode FTP server, among the mapping files published alongside the UCD. The UTF-8 form is fully documented in the Conformance section of the Unicode standard, and decoding it into 21-bit Unicode code points is pure bit arithmetic that requires no table.
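The two pieces described above can be sketched as follows. This is only an illustration, not a complete converter: the function names are made up for this example, and the table loader assumes mapping lines of the form "0x82A0<TAB>0x3042<TAB># name" (Shift-JIS code first, Unicode second), which should be checked against the file you actually download.

```python
def decode_utf8(data: bytes) -> list:
    """Decode UTF-8 bytes into Unicode code points using bit arithmetic only."""
    code_points = []
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:                      # 1 byte: 0xxxxxxx
            cp, extra, min_cp = lead, 0, 0
        elif 0xC2 <= lead <= 0xDF:           # 2 bytes: 110xxxxx 10xxxxxx
            cp, extra, min_cp = lead & 0x1F, 1, 0x80
        elif 0xE0 <= lead <= 0xEF:           # 3 bytes: 1110xxxx + 2 trail bytes
            cp, extra, min_cp = lead & 0x0F, 2, 0x800
        elif 0xF0 <= lead <= 0xF4:           # 4 bytes: 11110xxx + 3 trail bytes
            cp, extra, min_cp = lead & 0x07, 3, 0x10000
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + extra >= n:
            raise ValueError("truncated sequence at offset %d" % i)
        for b in data[i + 1:i + 1 + extra]:
            if b & 0xC0 != 0x80:             # trail bytes must be 10xxxxxx
                raise ValueError("invalid trail byte after offset %d" % i)
            cp = (cp << 6) | (b & 0x3F)
        # Reject overlong forms, surrogates, and values beyond U+10FFFF.
        if cp < min_cp or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("ill-formed sequence at offset %d" % i)
        code_points.append(cp)
        i += extra + 1
    return code_points


def load_mapping(lines):
    """Build {Unicode code point: Shift-JIS code} from mapping-file lines.

    Assumed line format: two hex columns, Shift-JIS then Unicode,
    with an optional '#' comment.
    """
    table = {}
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue                         # blank line or comment only
        sjis, ucs = int(fields[0], 16), int(fields[1], 16)
        table.setdefault(ucs, sjis)          # keep the first mapping seen
    return table


def utf8_to_sjis(data: bytes, table) -> bytes:
    """Convert UTF-8 bytes to Shift-JIS bytes using a loaded mapping table."""
    out = bytearray()
    for cp in decode_utf8(data):
        if cp not in table:
            raise LookupError("U+%04X has no Shift-JIS mapping" % cp)
        code = table[cp]
        if code > 0xFF:                      # double-byte code: lead byte first
            out.append(code >> 8)
        out.append(code & 0xFF)
    return bytes(out)
```

With a table loaded from the full mapping file, utf8_to_sjis() fails loudly on any character that has no Shift-JIS equivalent, which makes it easy to see exactly which characters a given table does not cover.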
Existing tools already do this for you, because they integrate both the table and the parser:
- Java (international edition) has a Shift-JIS mapping to Unicode which is reversible. It is used with the Charset support in java.io.* and java.nio.* packages and classes. You can even use the prebuilt tool native2ascii (from the Java SDK) to do that:
    native2ascii -encoding UTF-8 < filename.UTF-8.txt | native2ascii -reverse -encoding SHIFT-JIS > filename.SHIFT-JIS.txt
- GNU recode on Linux/Unix may do that for you too.
- the open-source ICU library offered by IBM has an API for this and supports mappings for lots of charsets.
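As an illustration of the same integrated approach in a runtime not named in the list above, here is a minimal sketch using Python's standard codecs, which bundle both a UTF-8 decoder and a Shift-JIS table. The function name transcode is made up for this example. Note that Python ships "shift_jis" and "cp932" as separate codecs with slightly different repertoires, so which one matches your target system is an assumption you need to verify.

```python
def transcode(data: bytes, errors: str = "strict") -> bytes:
    """Re-encode UTF-8 bytes as Shift-JIS using Python's built-in codecs.

    errors="strict" raises UnicodeEncodeError for characters that have no
    Shift-JIS mapping; errors="replace" substitutes '?' for them instead.
    """
    text = data.decode("utf-8")              # UTF-8 bytes -> Unicode string
    return text.encode("shift_jis", errors)  # Unicode string -> Shift-JIS bytes


if __name__ == "__main__":
    utf8_bytes = "\u65e5\u672c\u8a9e".encode("utf-8")   # the word for "Japanese language"
    print(transcode(utf8_bytes).hex())
```

Running with errors="strict" first is a quick way to find out which characters in your data have no mapping in the chosen codec.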
----- Original Message -----
Sent: Thursday, November 25, 2004 6:00 AM
Subject: Shift-JIS conversion.
Can anyone please tell me how to convert from UTF-8 to Shift-JIS?
Please let me know if there is any formula to do it other than using the ready-made functions provided by Perl, because these functions do not provide mappings for all characters.
Cybage Software Private Ltd.