Re: New UTF-8 decoder stress test file

From: Markus Kuhn
Date: Mon Sep 27 1999 - 04:55:05 EDT

Dan Oscarsson wrote on 1999-09-27 06:58 UTC:
> Also, I think a good UTF-8 decoder should handle and accept invalid
> UTF-8 sequences. My decoders do that, and because of that they can
> handle both ISO 8859-1 and UTF-8 encoded data.

Thanks for warning us about your UTF-8 decoders. They not only seem to
violate the letter of ISO 10646-1 Annex R, but also seem predestined to
become exactly the troublemakers in security applications that RFC 2279
warns about. (See the comments in my test file for details.)
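
To make the concern concrete, here is a minimal Python sketch, not taken
from anyone's actual decoder. The function lenient_decode is a
hypothetical stand-in for a permissive decoder of the kind described
above: it accepts any two-byte sequence without checking that it is the
shortest form, and treats everything else as ISO 8859-1. Only two-byte
sequences are handled, to keep it short. The payload is the octet
sequence 2F C0 AE 2E 2F that the RFC 2279 security considerations use as
an example.

```python
def lenient_decode(data: bytes) -> str:
    """Hypothetical permissive decoder (illustration only).

    Accepts 2-byte UTF-8 sequences without rejecting overlong forms and
    falls back to ISO 8859-1 for any byte that does not fit.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        # 110xxxxx 10xxxxxx, with no check that two bytes are really needed
        if (0xC0 <= b <= 0xDF and i + 1 < len(data)
                and 0x80 <= data[i + 1] <= 0xBF):
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            out.append(chr(b))  # interpret the byte as ISO 8859-1
            i += 1
    return "".join(out)


def strict_decode(data: bytes) -> str:
    """Strict decoding via Python's built-in codec; raises on invalid input."""
    return data.decode("utf-8", errors="strict")


def is_rejected_by_strict(data: bytes) -> bool:
    """Return True if the strict decoder refuses the input."""
    try:
        strict_decode(data)
    except UnicodeDecodeError:
        return True
    return False


# 2F C0 AE 2E 2F: "/", an overlong encoding of ".", ".", "/"
payload = b"/\xc0\xae./"

# A filter that inspects the raw bytes for "/../" sees nothing suspicious.
byte_filter_passes = b"/../" not in payload

# The permissive decoder then turns the overlong form into a plain ".".
decoded = lenient_decode(payload)
```

With this input the byte-level check passes, the permissive decoder
yields the string "/../", and the strict decoder refuses the data
outright. That gap between what a filter checks and what a later stage
decodes is the kind of problem the RFC describes.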


Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at,  WWW: <>

