Re: Java and UTF

From: Glen Perkins (gperkins@netcom.com)
Date: Wed Jul 02 1997 - 13:57:36 EDT


"Daniel R. Kegel" <dank@alumni.caltech.edu> wrote:
>
> Pierre wrote:
> > I know about these, and, because they also store the length (a binary
> >number), they're useless for, say, a Unicode plain-text editor.
>
> No, they're still quite useful; you just have to strip off the
> length. Use a memory stream rather than a file stream,
> strip off the two extra bytes, and write the buffer to disk. Voila!
>
> I am still looking for better UTF methods in Java- I think the String
> class might in fact have a toByteArray() sort of method- but I think
> it's fair to say that Java can do UTF, and you can write a UTF-8
> plain text editor in it without too much fuss.
> - Dan

No, no. It's easy. Here's a piece of a slide from a presentation I've
given a couple of times on Java i18n:

=======================================
TO OUTPUT IN UTF-8:

The best way for plain text is the new way, as in the previous examples:

try
{
  PrintWriter out = new PrintWriter(
           new BufferedWriter(
           new OutputStreamWriter(
           new FileOutputStream("out.unicode"), "UTF8")));

  out.println("Hello, \u65e5\u672c");
  out.close();
}
catch (Exception e) {System.out.println(e);}

Output from the above (contents of 'out.unicode' file, in hex; the
trailing CR LF is the line separator println() wrote on this platform):
48 65 6C 6C 6F 2C 20 E697A5 E69CAC 0D 0A
H  e  l  l  o  ,     ni     hon    CR LF

==========================================
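Reading works the same way in reverse. A rough sketch, not from the slide: wrap a FileInputStream in an InputStreamReader with the same encoding name. The class name Utf8RoundTrip, the helper names, and the temp-file prefix are made up for illustration, and the helper wraps IOException only to keep the example short.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical demo class: write a String as UTF-8, then read it back.
class Utf8RoundTrip {
    // Write text to the file as plain UTF-8 (no length prefix).
    static void write(File f, String text) throws IOException {
        Writer out = new BufferedWriter(
                     new OutputStreamWriter(
                     new FileOutputStream(f), "UTF8"));
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    // Read the whole file back, decoding the bytes as UTF-8.
    static String read(File f) throws IOException {
        Reader in = new BufferedReader(
                    new InputStreamReader(
                    new FileInputStream(f), "UTF8"));
        try {
            StringBuffer sb = new StringBuffer();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    // Convenience helper (an assumption for this sketch): round-trip
    // the text through a temporary file and return what was read back.
    static String roundTrip(String text) {
        try {
            File f = File.createTempFile("utf8demo", ".txt");
            try {
                write(f, text);
                return read(f);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new RuntimeException(e.toString());
        }
    }

    public static void main(String[] args) {
        String s = "Hello, \u65e5\u672c";
        System.out.println(roundTrip(s).equals(s));
    }
}
```

The point is only that the encoding name goes on the Reader/Writer, and the rest of the program deals in ordinary Strings.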

That'll do it for you.
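As for the "toByteArray() sort of method" on String: String.getBytes() takes an encoding name, so the UTF-8 bytes are available in memory with no file and no length prefix to strip. A minimal sketch follows; the class name Utf8Bytes and the hex helper are made up for illustration, and the helper turns the checked exception into a RuntimeException only for brevity.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class: get the UTF-8 bytes of a String in memory.
class Utf8Bytes {
    // Encode the String as UTF-8; unlike DataOutputStream.writeUTF(),
    // the result carries no two-byte length prefix.
    static byte[] toUtf8(String s) {
        try {
            return s.getBytes("UTF8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e.toString());
        }
    }

    // Format bytes as space-separated, two-digit uppercase hex.
    static String hex(byte[] bytes) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            int b = bytes[i] & 0xFF;
            if (b < 0x10) {
                sb.append('0');
            }
            sb.append(Integer.toHexString(b).toUpperCase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: 48 65 6C 6C 6F 2C 20 E6 97 A5 E6 9C AC
        System.out.println(hex(toUtf8("Hello, \u65e5\u672c")));
    }
}
```

These are the same bytes as in the file dump from the slide, minus the line separator, since nothing here calls println() on the text itself.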

__Glen Perkins__
glen.perkins@nativeguide.com


