Re: Bastardizations of UTF-8 (was: Re: [OT] Unicode-compatible SQL?)

From: John O'Conner (john.oconner@eng.sun.com)
Date: Mon Feb 05 2001 - 15:13:22 EST


Perhaps the methods readUTF and writeUTF should be deprecated in favor of
read/writeString. I will submit an RFE (request for enhancement) for this.

I noticed that although the Data{Input,Output} interfaces clearly say that
readUTF/writeUTF handle a "Java modified UTF-8", the actual javadoc in
DataOutputStream says that writeUTF writes the String as UTF-8. Also, the doc
for UTFDataFormatException is confusing on the issue, saying UTF-8 in one place
and "modified UTF-8" in the doc for DataInputStream.
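
To make the difference concrete, here is a small sketch comparing the bytes
writeUTF produces with plain UTF-8 for the same String. It assumes Java 8 or
later (StandardCharsets, try-with-resources, UncheckedIOException); the class
name ModifiedUtf8Demo and its helper methods are made up for illustration and
are not part of any API. The two visible differences are that U+0000 is written
as the two bytes C0 80, and that a supplementary character is written as two
3-byte encoded surrogates rather than one 4-byte sequence. writeUTF also
prepends a 2-byte length.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares writeUTF output with standard UTF-8.
public class ModifiedUtf8Demo {

    // Bytes produced by DataOutputStream.writeUTF: a 2-byte length prefix
    // followed by the string in Java's modified UTF-8.
    public static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Raised, for example, when the encoded form exceeds 65535 bytes.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes produced by the standard UTF-8 charset encoder.
    public static byte[] standardUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Space-separated uppercase hex dump of a byte array.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nul = "\u0000";            // the NUL character, U+0000
        String supplementary = "\uD800\uDC00"; // U+10000 as a surrogate pair

        System.out.println(hex(writeUtfBytes(nul)));               // 00 02 C0 80
        System.out.println(hex(standardUtf8Bytes(nul)));           // 00
        System.out.println(hex(writeUtfBytes(supplementary)));     // 00 06 ED A0 80 ED B0 80
        System.out.println(hex(standardUtf8Bytes(supplementary))); // F0 90 80 80
    }
}
```

For plain ASCII text the two encodings agree apart from the length prefix,
which is probably why the mismatch is easy to overlook.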

That's 1 RFE for better method names and 2 bugs in the API documentation! I'll
submit all 3... if they don't already exist in the db.

Regards,
John O'Conner

John Cowan wrote:

> The internal encoding is exposed by the regrettably named readUTF and
> writeUTF methods of java.io.Data{Input,Output}Stream, which should have
> been named readString and writeString. People have assumed that they
> are general-purpose UTF-8 read/write functions.
>


