Re: Character converter

From: Jungshik Shin (jshin@pantheon.yale.edu)
Date: Mon Apr 05 1999 - 21:47:01 EDT


On Mon, 5 Apr 1999, John O'Conner wrote:

> Since I do not know how to enter UTF-8 on your platform, I wrote a short
> Java application that converts a file from any supported charset encoding
> to any other supported charset encoding. You can edit your HTML files in
> whatever character set you have available, then run this app to convert it
> to UTF-8. If it doesn't work for you, you can modify it until it does.
>
> An example of correct syntax is the following:
> java Converter oldfile.txt Big5 newfile.txt UTF-8

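   I'm afraid you reinvented the wheel. 'native2ascii', included in the
JDK (I haven't had a chance to look at any JDK other than the Unix
version, but I believe it's available in the JDK for other OSes as
well), can do the same thing, although in two steps instead of one: the
first pass turns the file into ASCII with \uXXXX escapes, and the second
pass (-reverse) turns those escapes into UTF-8.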

   native2ascii -encoding SRC_ENCODING_NAME file.txt | \
   native2ascii -reverse -encoding UTF8 > newfile.txt

   It should be trivial to write a shell script (in Mac OS 8.x, perhaps
an AppleScript) to make this simpler. So native2ascii is a handy tool
for converting between many different encodings, although a pretty
"expensive" one, and "universal" only to the extent that Java is
"universal". POSIX compliant systems may offer a "less expensive"(?)
alternative: the iconv(1) utility and the iconv(3) API call.
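
   For what it's worth, here is a rough sketch of what such a wrapper
might look like, assuming a POSIX shell. The function names
(convert_n2a, convert_iconv) and their argument order are made up for
illustration, and the encoding names accepted by native2ascii and by
iconv differ from one implementation to another, so check what your
system supports before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers; the names and argument order are assumptions
# made for illustration, not part of any standard tool.

# convert_n2a SRC_ENCODING INFILE OUTFILE
# Two-step conversion with the JDK's native2ascii: the first pass writes
# ASCII with \uXXXX escapes to stdout, the second pass (-reverse) reads
# that from stdin and writes UTF-8.
convert_n2a() {
    if [ "$#" -ne 3 ]; then
        echo "usage: convert_n2a SRC_ENCODING INFILE OUTFILE" >&2
        return 2
    fi
    native2ascii -encoding "$1" "$2" |
        native2ascii -reverse -encoding UTF8 > "$3"
}

# convert_iconv SRC_ENCODING [INFILE]
# One-step conversion to UTF-8 with iconv(1). Reads stdin when INFILE is
# omitted and always writes to stdout.
convert_iconv() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "usage: convert_iconv SRC_ENCODING [INFILE]" >&2
        return 2
    fi
    src=$1
    shift
    iconv -f "$src" -t UTF-8 "$@"
}
```

   With these defined, something like "convert_n2a Big5 oldfile.txt
newfile.txt" or "convert_iconv Big5 oldfile.txt > newfile.txt" would
stand in for the Converter invocation quoted above, provided the
encoding name is one your JDK or iconv recognizes.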

     Jungshik Shin



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:45 EDT