Re: New UTF-8 decoder stress test file

From: Glen Perkins (Glen.Perkins@nativeguide.com)
Date: Sun Sep 26 1999 - 15:29:46 EDT


That's true, but only for the .class file's internal data encoding scheme.
The internal encoding of a .class file doesn't matter to Java developers
unless they want to build an app that directly parses the class file, such
as a decompiler. Just think of it as a proprietary file format.

Java developers who use Java's standard APIs for UTF-8 in their own (and
others') apps get the standard UTF-8.
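
A small sketch of the difference, for anyone who wants to see the bytes. The class name NulEncodingDemo and its helper methods are made up for illustration. StandardCharsets is assumed to be available, which means a JDK much newer than the ones current at the time of this thread. DataOutputStream.writeUTF is used only as a convenient way to observe the "modified UTF-8" form, which its documentation describes as the same scheme used for strings inside class files. It prepends a two-byte length, which the sketch strips off.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NulEncodingDemo {
    // Standard UTF-8 through the String API: U+0000 becomes the single byte 00.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8 through DataOutputStream.writeUTF: U+0000 becomes c0 80.
    // writeUTF emits a two-byte length prefix first, so it is dropped here.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Format bytes as space-separated lowercase hex, e.g. "61 00 62".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b"; // 'a', U+0000, 'b'
        System.out.println("standard: " + hex(standardUtf8(s))); // standard: 61 00 62
        System.out.println("modified: " + hex(modifiedUtf8(s))); // modified: 61 c0 80 62
    }
}
```

The c0 80 pair only shows up on the writeUTF path. The ordinary String-to-bytes conversion produces the plain 00 byte that a conforming UTF-8 decoder expects.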

Glen Perkins

----- Original Message -----
From: Valeriy E. Ushakov <uwe@ptc.spbu.ru>
To: Unicode List <unicode@unicode.org>
Cc: Unicode List <unicode@unicode.org>; <linux-utf8@humbolt.geo.uu.nl>
Sent: Sunday, September 26, 1999 10:11 AM
Subject: Re: New UTF-8 decoder stress test file

> On Sun, Sep 26, 1999 at 09:22:26AM -0700, Markus Kuhn wrote:
>
> > 4.3 Overlong representation of the NUL character
> >
> > The following five sequences should also be rejected like malformed
> > UTF-8 sequences and should not be treated like the ASCII NUL
> > character.
> >
> > 4.3.1 U+0000 = c0 80 = "?"
>
> I believe that's exactly what JDK uses to encode U+0000 in utf-8
> encoded NUL terminated C strings to distinguish U+0000 which is part
> of a string from the terminating NUL. I can't find the reference,
> though.
>
> SY, Uwe
> --
> uwe@ptc.spbu.ru | Zu Grunde kommen
> http://www.ptc.spbu.ru/~uwe/ | Ist zu Grunde gehen
>


