John,
It does impact developers.
The API for DataInputStream defines FSS_UTF, which includes the funky
null behavior.
http://java.sun.com/products/jdk/1.2/docs/api/java/io/DataInputStream.html
Since this API and others use this UTF variant, it gets into file formats, and
applications end up supporting it.
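
To make the null behavior concrete, here is a small sketch comparing what DataOutputStream.writeUTF writes with what Java's standard UTF-8 converter produces for a string containing U+0000. The class and helper names (NulEncodingDemo, modifiedUtf8, standardUtf8, hex) are made up for illustration, and StandardCharsets assumes a Java release much newer than JDK 1.2.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NulEncodingDemo {
    // Bytes written by DataOutputStream.writeUTF, with the 2-byte length prefix removed.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Bytes produced by the standard UTF-8 converter.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Space-separated uppercase hex, e.g. "61 00 62".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        System.out.println("writeUTF: " + hex(modifiedUtf8(s)));  // 61 C0 80 62
        System.out.println("UTF-8:    " + hex(standardUtf8(s)));  // 61 00 62
    }
}
```

So both of you are right: the UTF-8 converter emits a plain 0x00, while anything written through writeUTF (and read back with readUTF) carries the 0xC0 0x80 form.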
tex
John O'Conner wrote:
>
> Within a String, the encoding of char values is practically irrelevant. It is a
> hidden encoding that is never exposed to the user...or developer. When you access
> String char values, you use an index to 16-bit Unicode values. To my knowledge,
> Sun does not claim that its internal encoding of String is UTF-8 in any of its API
> documentation.
>
> Any component or converter that claims to produce a UTF-8 encoding should not
> behave as you describe. For example, Java's UTF-8 converter does not encode U+0000
> as 0xC0 0x80. If it ever does, please file a bug.
>
> Regards,
> John O'Conner
>
> DougEwell2@cs.com wrote:
>
> > This is laziness, intended to get around the "problem" of supplementary code
> > points instead of handling them like any other code points. This reminds me
> > of the Java bastardization of UTF-8, in which U+0000 is encoded 0xC0 0x80 so
> > that no character string will ever contain the byte 0x00. (Nobody has ever
> > explained to me why a character string would contain U+0000 in the first
> > place.)
--
According to Murphy, nothing goes according to Hoyle.
--------------------------------------------------------------------------
Tex Texin                  Director, International Business
mailto:Texin@Progress.com  +1-781-280-4271  Fax:+1-781-280-4655
Progress Software Corp.    14 Oak Park, Bedford, MA 01730
http://www.Progress.com    #1 Embedded Database
Globalization Program      http://www.Progress.com/partners/globalization.htm
---------------------------------------------------------------------------