Re: Strange "UTF" Charset

From: Jukka K. Korpela
Date: Mon Aug 08 2005 - 11:22:58 CDT

    On Mon, 8 Aug 2005, Tom Gewecke wrote:

    > Recently I came across a web site with the declared charset "BIGFIVE_TO_UTF"

    There is no such charset defined, of course. It sounds like a misguided
    attempt at giving the browser the completely unnecessary information about
    how the server transcoded the document before sending it, resulting in a
    failure to declare the encoding at all. (The <meta> tag specifies the
    above-mentioned charset, and the HTTP headers specify none.)
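
    As a rough illustration of why such a label is useless to a client,
    here is a minimal sketch that pulls the charset parameter out of a
    <meta> tag and checks whether it names a known encoding. The helper
    names are hypothetical, and Python's codec registry is used only as a
    stand-in for a browser's list of supported charsets; it is not the
    IANA registry.

```python
import codecs
import re

# Hypothetical pattern: finds the first "charset=..." parameter in the markup.
META_CHARSET = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

def declared_charset(html):
    """Return the charset label declared in the markup, or None if absent."""
    match = META_CHARSET.search(html)
    return match.group(1) if match else None

def is_known_charset(name):
    """True if Python's codec registry recognizes the label (an assumption
    used here as a proxy for 'a client would understand it')."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False

tag = '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=BIGFIVE_TO_UTF">'
label = declared_charset(tag)
print(label, is_known_charset(label))
```

    A client that gets False here is in the same position as if no
    encoding had been declared at all, and has to fall back on its
    default or on guessing.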

    > Apparently Win IE can display this page correctly, but no other browser
    > will.

    Opera seems to display the page well. The strange thing is that this
    happens even when the default encoding is not UTF-8. View Source in
    Opera or IE explains this, and creates another puzzle, since it shows:
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

    Thus, my guess is that the server for some reason sends different data
    to Firefox, and that this data is garbage. Probably the Chinese
    characters were converted incorrectly. The markup Firefox receives has
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html;

    Firefox seems to send
    Accept-Charset: ???,utf-8;q=0.7,*;q=0.7
    in the HTTP headers, where ??? is the encoding set as default. (This is
    rather illogical; my preference for an encoding to be tried when a page
    fails to specify its encoding is logically quite independent of my
    preferences on the encoding of a page, in cases where a page exists in
    differently encoded forms.)
    Opera seems to send (when running on Windows XP)
    Accept-Charset: windows-1252, utf-8, utf-16, iso-8859-1;q=0.6, *;q=0.1
    And IE seems to send no Accept-Charset header.

    Thus, the presence of utf-8 with a quality value of 0.7 might have an
    effect in this case. But if I change the default encoding in Firefox to
    UTF-8, then it starts sending
    Accept-Charset: utf-8,*
    and the problem still remains.
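
    To compare the Accept-Charset values quoted above, one can parse each
    into (charset, quality) pairs. The following is only a sketch under
    simplifying assumptions: the function name is hypothetical, a missing
    q parameter is taken as 1.0, and the special rules HTTP applies to
    charsets that are not listed at all are ignored.

```python
def parse_accept_charset(value):
    """Split an Accept-Charset value into (charset, q) pairs, highest q first.

    A missing q parameter is treated as 1.0. A malformed q value raises
    ValueError in this sketch.
    """
    prefs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        prefs.append((name.strip().lower(), q))
    # sorted() is stable, so charsets with equal q keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)

opera = "windows-1252, utf-8, utf-16, iso-8859-1;q=0.6, *;q=0.1"
firefox_utf8_default = "utf-8,*"

for header in (opera, firefox_utf8_default):
    print(parse_accept_charset(header))
```

    In both of these headers utf-8 ends up with quality 1.0, so a server
    that negotiates on Accept-Charset alone has no obvious reason to treat
    the two clients differently.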

    Jukka "Yucca" Korpela
