Re: Unicode conference papers

From: Martin Duerst
Date: Wed Nov 22 2006 - 00:10:07 CST

    At 08:36 06/11/22, John Cowan wrote:
    >Richard Ishida scripsit:
    >> 2. what is doubly-encoded utf-8?
    >Text encoded as UTF-8, then reinterpreted using an 8-bit encoding (often
    >Latin-1 or Windows-1252), and then re-encoded incorrectly as UTF-8 for
    >a second time.

    Yes. The W3C site has quite a lot of these, too, even if they are
    fortunately usually limited to single characters such as the copyright
    sign. Here's an example:

    They often arise when the download path and the upload path handle
    character encoding information differently.
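
    For illustration only, here is a minimal Python sketch of how such a
    double encoding can come about for the copyright sign, and how one
    round of it might be undone. The function names double_encode and
    repair are made up for this sketch, and Latin-1 is assumed as the
    intermediate 8-bit encoding; real cases may involve Windows-1252 or
    some other encoding instead.

```python
def double_encode(text: str, legacy: str = "latin-1") -> str:
    """Simulate the fault: UTF-8 bytes are misread as a legacy 8-bit encoding."""
    return text.encode("utf-8").decode(legacy)


def repair(text: str, legacy: str = "latin-1") -> str:
    """Try to undo one round of double encoding.

    Returns the input unchanged when the reversal does not apply, for
    example when the text was never double-encoded in the first place.
    """
    try:
        return text.encode(legacy).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "\u00a9 2006"            # copyright sign followed by a year
garbled = double_encode(original)   # what a reader sees after the fault

print(garbled.encode("utf-8"))      # bytes of the doubly-encoded text
print(repair(garbled) == original)  # the round trip restores the original
```

    The copyright sign (U+00A9) is the two bytes C2 A9 in UTF-8. Read as
    Latin-1, these become the two characters U+00C2 and U+00A9, which are
    then written out again as the four bytes C3 82 C2 A9. Note that repair
    is only a heuristic: pure ASCII text passes through unchanged, but
    legitimate text that happens to look like misread UTF-8 would be
    altered.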

    Regards, Martin.

    #-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
