Character converter

From: John O'Conner
Date: Mon Apr 05 1999 - 15:45:36 EDT


Since I do not know how to enter UTF-8 on your platform, I wrote a short
Java application that converts a file from any supported charset encoding
to any other supported charset encoding. You can edit your HTML files in
whatever character set you have available, then run this app to convert
them to UTF-8. If it doesn't work for you, you can modify it until it does.

An example of correct syntax is the following:
java Converter oldfile.txt Big5 newfile.txt UTF-8
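
The encoding names on the command line have to be ones your Java runtime recognizes. Here is a minimal sketch for checking a name before running the converter. It assumes a runtime that has java.nio.charset (Java 1.4 or later, which is newer than this post), and the class name CharsetCheck and the helper isUsable are made up for illustration:

```java
import java.nio.charset.Charset;

public class CharsetCheck {

  // Hypothetical helper: true if this runtime knows the charset name
  // and can encode with it (a few charsets are decode-only).
  static boolean isUsable(String name) {
    try {
      return Charset.isSupported(name) && Charset.forName(name).canEncode();
    } catch (IllegalArgumentException e) {
      // Thrown for syntactically illegal charset names.
      return false;
    }
  }

  public static void main(String[] args) {
    // Report on every name given on the command line.
    for (String name : args) {
      System.out.println(name + ": "
          + (isUsable(name) ? "supported" : "not supported"));
    }
  }
}
```

Running it as "java CharsetCheck Big5 UTF-8" prints one line per name. Which names are available depends on the Java runtime you have installed.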

Have fun.

John O'Conner



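import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

/**
 * The Converter class converts a source input stream to a target output
 * stream using a source and target encoding.
 * @author John O'Conner
 * @version 0.1
 */
public class Converter {

  /**
   * @param srcStream The input stream that should be converted
   * @param srcEncoding The encoding of the input stream
   * @param outStream The output stream
   * @param outEncoding The encoding that should be used in the output stream
   */
  public Converter(InputStream srcStream, String srcEncoding,
                    OutputStream outStream, String outEncoding)
                    throws UnsupportedEncodingException {

    in = new InputStreamReader(srcStream, srcEncoding);
    out = new OutputStreamWriter(outStream, outEncoding);
  }

  /** Reads characters from the source and writes them to the target. */
  public void convert() throws IOException {
    char[] ch = new char[CHAR_COUNT];
    int count;
    // read() may return fewer than CHAR_COUNT chars before the end of
    // the stream, so keep going until it reports -1.
    do {
      count = in.read(ch, 0, CHAR_COUNT);
      if (count > 0) {
        out.write(ch, 0, count);
      }
    } while (count != -1);
    // Closing the writer flushes it and lets stateful encoders finish.
    out.close();
    in.close();
  }

  public static void main(String[] args) {
    if (args.length != 4) {
      System.out.println("Syntax: Converter srcFileName srcEncoding "+
                          "targetFileName targetEncoding");
    } else {
      try {
        FileInputStream inFileStream = new FileInputStream(args[0]);
        FileOutputStream outFileStream = new FileOutputStream(args[2]);
        Converter conv = new Converter(inFileStream, args[1],
                                      outFileStream, args[3]);
        conv.convert();
      } catch(Exception e) {
        System.err.println("Conversion failed: " + e);
      }
    }
  }

  InputStreamReader in;
  OutputStreamWriter out;
  private static int CHAR_COUNT = 256;
}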
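
Since the conversion works on plain streams, the same idea can be tried without any files. What follows is only a sketch, not part of the posted class: the names TranscodeSketch and transcode are made up for illustration, and it uses try-with-resources (Java 7 or later), which is newer than this post. It converts the ISO-8859-1 bytes for "café" to UTF-8 in memory and prints the result in hex:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class TranscodeSketch {

  // Hypothetical helper: decode src using srcEncoding and re-encode
  // the characters into dst using outEncoding.
  static void transcode(InputStream src, String srcEncoding,
                        OutputStream dst, String outEncoding)
                        throws IOException {
    // try-with-resources closes (and so flushes) both ends on exit.
    try (Reader in = new InputStreamReader(src, srcEncoding);
         Writer out = new OutputStreamWriter(dst, outEncoding)) {
      char[] buf = new char[256];
      int count;
      while ((count = in.read(buf, 0, buf.length)) != -1) {
        out.write(buf, 0, count);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // "café" in ISO-8859-1: the e-acute is the single byte 0xE9.
    byte[] latin1 = { 'c', 'a', 'f', (byte) 0xE9 };
    ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
    transcode(new ByteArrayInputStream(latin1), "ISO-8859-1", utf8, "UTF-8");
    // In UTF-8 the e-acute (U+00E9) becomes the two bytes 0xC3 0xA9.
    for (byte b : utf8.toByteArray()) {
      System.out.printf("%02X ", b);
    }
    System.out.println();
  }
}
```

The loop runs until read() returns -1 instead of stopping at the first short read, because a reader is allowed to return fewer characters than requested before the end of the stream.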


Michael Everson on 04/05/99 10:55:49 AM

To: Unicode List
cc: (bcc: John O'Conner/QAD1)
Subject: Re: Netscape

At 13:42 -0400 1999-04-05, Winkler, Arnold F wrote:
>Go to and download
>latest Netscape browser for the Mac, capable of supporting UTF-8

Yes, Arnold, I have Communicator 4.05. I need to know how to put Unicode
characters in my page though.

Michael Everson, Everson Gunn Teoranta **
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Phone: +353 1 478-2597 ** Fax: +353 1 478-2597 (by arrangement)
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:45 EDT