Re: New UTF-8 decoder stress test file

From: Valeriy E. Ushakov (uwe@ptc.spbu.ru)
Date: Sun Sep 26 1999 - 13:10:40 EDT


On Sun, Sep 26, 1999 at 09:22:26AM -0700, Markus Kuhn wrote:

> 4.3 Overlong representation of the NUL character
>
> The following five sequences should also be rejected like malformed
> UTF-8 sequences and should not be treated like the ASCII NUL
> character.
>
> 4.3.1 U+0000 = c0 80 = "?"

I believe that's exactly what the JDK uses to encode U+0000 in UTF-8
encoded, NUL-terminated C strings, so that a U+0000 which is part of
the string can be distinguished from the terminating NUL. I can't find
the reference, though.
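
A minimal sketch of what I mean, assuming the encoding in question is
the one written by java.io.DataOutputStream.writeUTF (which the JDK
documentation describes as a modified UTF-8). The class and helper
names below are made up for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; only DataOutputStream.writeUTF is real JDK API.
public class ModifiedUtf8Demo {

    // Encode a string with writeUTF and drop the 2-byte length prefix
    // that writeUTF puts in front of the encoded bytes.
    public static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Format bytes as space-separated lowercase hex, e.g. "c0 80".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0000 between 'a' and 'b' comes out as the two bytes c0 80,
        // so no 0x00 byte appears inside the encoded string.
        System.out.println(hex(encode("a\u0000b"))); // prints "61 c0 80 62"
    }
}
```

So a decoder that rejects c0 80 as test 4.3.1 asks would presumably
also reject data written this way, unless it treats that one sequence
as a special case.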

SY, Uwe

-- 
uwe@ptc.spbu.ru                         |       Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen


