From: Philippe Verdy (email@example.com)
Date: Thu May 11 2006 - 18:37:01 CDT
From: "Jukka K. Korpela" <firstname.lastname@example.org>
> On Thu, 11 May 2006, Tom Gewecke wrote:
>> If anyone on the list is running Win IE 7b2, could they let me know whether
>> it also has IE 6's behavior of displaying bad UTF-8 as if it were correct?
>> The test page is
> Yes it has. (Tested on 7.0.5346.5.)
I get the same result.
Is there any reason for IE6/IE7 to be so "tolerant" of invalid UTF-8? Are there really lots of processes that produce documents encoded with invalid UTF-8? I don't know of any (not even from Microsoft itself).
Doesn't it break or severely limit the encoding autodetection in IE? This may explain why IE so often displays Chinese characters in the middle of a French webpage hosted on a server that simply does not specify its actual encoding: IE returns a false positive match with UTF-8, instead of identifying the ISO-8859-1 encoding that was actually used.
This is a severe and very annoying bug for users (for example, French users trying to read webpages that were encoded as ISO-8859-1 but are interpreted by default as UTF-8 and shown as if they were Chinese, even though the bytes are not valid UTF-8).
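To illustrate why a strict decoder should not produce this kind of false positive, here is a minimal Python sketch. The function name guess_encoding and the two-candidate fallback are my own assumptions for illustration; this is not a description of how IE's autodetection works.

```python
def guess_encoding(data: bytes) -> str:
    """Return 'utf-8' only if the bytes are strictly valid UTF-8.

    Hypothetical helper: anything that fails strict UTF-8 decoding
    falls back to ISO-8859-1, which accepts every byte value.
    """
    try:
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        return "iso-8859-1"


text = "Voilà un café très apprécié"
french_latin1 = text.encode("iso-8859-1")
french_utf8 = text.encode("utf-8")

print(guess_encoding(french_latin1))
print(guess_encoding(french_utf8))
```

In the ISO-8859-1 bytes, the accented letter is a single byte such as 0xE0 followed by an ASCII byte. 0xE0 is a UTF-8 lead byte that requires two continuation bytes, so a strict check rejects the sequence immediately. Note that pure ASCII text passes the UTF-8 check too, which is harmless because both encodings agree on ASCII.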
And even if the server claims that the webpage was encoded with UTF-8, IE should still apply the strict rules, and then either:
- reject the document (asking the user to select another encoding, or to try to reload it)
These solutions should be applied similarly to unpaired surrogates.
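As a rough sketch of what "applying the strict rules" could look like, the following Python snippet reports the first byte offset at which a document stops being valid UTF-8, so that a caller could reject it or prompt the user. The helper name find_utf8_error and the sample byte strings are assumptions chosen for illustration; Python's built-in strict decoder also rejects surrogate code points encoded as three-byte sequences.

```python
from typing import Optional, Tuple


def find_utf8_error(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (byte offset, reason) of the first invalid sequence, or None.

    Hypothetical helper built on Python's strict UTF-8 decoder.
    """
    try:
        data.decode("utf-8", errors="strict")
        return None
    except UnicodeDecodeError as exc:
        return (exc.start, exc.reason)


samples = {
    "valid euro sign": "€".encode("utf-8"),
    "overlong '/'": b"ab\xc0\xaf",            # 0xC0 is never a valid lead byte
    "truncated sequence": b"\xe2\x82",        # first two bytes of U+20AC only
    "unpaired surrogate": b"\xed\xa0\x80",    # U+D800 encoded as three bytes
}

for label, raw in samples.items():
    error = find_utf8_error(raw)
    if error is None:
        print(f"{label}: accepted")
    else:
        offset, reason = error
        print(f"{label}: rejected at byte {offset} ({reason})")
```

Only the first sample is accepted. The others are exactly the kinds of input a lenient decoder might silently display as if they were correct.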
This archive was generated by hypermail 2.1.5 : Thu May 11 2006 - 18:41:48 CDT