From: Martin Duerst ([email protected])
Date: Wed Nov 22 2006 - 00:10:07 CST
At 08:36 06/11/22, John Cowan wrote:
>
>Richard Ishida scripsit:
>
>> 2. what is doubly-encoded utf-8?
>
>Text encoded as UTF-8, then reinterpreted using an 8-bit encoding (often
>Latin-1 or Windows-1252), and then mistakenly re-encoded as UTF-8 a
>second time.
Yes. The W3C site has quite a lot of these, too, even if they are
fortunately usually limited to single characters such as the copyright
sign. Here's an example:
http://www.w3.org/2001/Annotea/User/Papers.html
They often arise when the download path and the upload path handle
character encoding information differently.
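To make the mechanism concrete, here is a minimal Python sketch of the
double encoding described above, using the copyright sign as the test
character. The function names (double_encode, repair) and the choice of
Latin-1 as the intermediate encoding are assumptions for illustration only;
a real page may have gone through Windows-1252 or some other 8-bit
encoding instead.

```python
# Illustrative sketch only: the helper names and the default codec are
# assumptions, not taken from any particular tool.

def double_encode(text: str, wrong_codec: str = "latin-1") -> bytes:
    """Encode as UTF-8, misread the bytes as wrong_codec, re-encode as UTF-8."""
    utf8_bytes = text.encode("utf-8")          # correct first encoding
    misread = utf8_bytes.decode(wrong_codec)   # bytes wrongly taken as 8-bit text
    return misread.encode("utf-8")             # second, unwanted UTF-8 encoding


def repair(data: bytes, wrong_codec: str = "latin-1") -> str:
    """Reverse the steps above. Raises an error if data does not fit the pattern."""
    misread = data.decode("utf-8")
    return misread.encode(wrong_codec).decode("utf-8")


original = "\u00a9"                  # COPYRIGHT SIGN, UTF-8 bytes C2 A9
garbled = double_encode(original)
print(garbled)                       # b'\xc3\x82\xc2\xa9'
print(repair(garbled) == original)   # True
```

Viewed as UTF-8, the garbled bytes show up as U+00C2 (A with circumflex)
followed by the copyright sign, which is the typical symptom on affected
pages. The repair step only works when the intermediate encoding is guessed
correctly and every character survived the round trip.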
Regards, Martin.
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:[email protected]