Re: Frequent incorrect guesses by the charset autodetection in IE7

From: James Kass (
Date: Thu Jul 13 2006 - 18:25:41 CDT

    Philippe Verdy wrote,

    > The autodetection mechanism is definitely broken, as it even breaks the HTML
    > code and structure (invalid tags generated, script errors, broken links with
    > incorrect syntax, broken javascripts), including at the most basic level (html
    > tags); and it now even interprets invalid JIS codes that are displayed as
    > square boxes or question marks.

    The autodetection mechanism may be broken, but it can't really be blamed
    for breaking the HTML code and structure. Without a character set
    declaration, the HTML code is already broken. No HTML validator should
    pass such a page.
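
    As a rough illustration of the kind of check meant here, the Python
    sketch below looks for a character set declaration either in an HTTP
    Content-Type header value or in a meta element near the top of the
    page. The function name declared_charset, the regular expressions,
    and the 1024-byte window are assumptions made for this example; they
    are not taken from any validator or from any browser's detection code.

```python
import re

# Hypothetical pattern: a <meta ...> tag carrying "charset=<name>".
# It covers both <meta charset="..."> and the older http-equiv form.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)

# Hypothetical pattern for a Content-Type header value.
_HEADER_CHARSET = re.compile(
    r"""charset\s*=\s*"?([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def declared_charset(html_bytes, http_content_type=None):
    """Return the declared charset name in lowercase, or None if absent.

    The HTTP header, when given, is checked first; otherwise only the
    first 1024 bytes of the document are searched (an assumed limit).
    """
    if http_content_type:
        match = _HEADER_CHARSET.search(http_content_type)
        if match:
            return match.group(1).lower()
    match = _META_CHARSET.search(html_bytes[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return None
```

    A page for which this returns None leaves the browser to guess, which
    is the situation described above.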

    > > Sounds like they need a volunteer. ...
    > I have reported that to them. But in fact the same is true for the web site of
    > the International Committee of the Red Cross ( in its French and Spanish
    > pages, even if it occurs less often; there, too, there is no charset declaration.

    The French language web pages of Croix-Rouge canadienne ...

    ... load just fine here. The page linked above has the character
    set declaration as iso-8859-1 and displays properly on MSIE 6.0
    with auto-detect disabled and the encoding set to UTF-8.

    It might be pointed out to the French web site people that pages
    properly tagged should load correctly. "En tout lieu. En tout temps."

    > Changing the web site to UTF-8 would NOT be an option (too many changes, and not
    > enough resources to verify and reencode the changes, including in ASP pages and
    > database interfaces; remember that this is not a computing organization, their
    > engineering resources are very limited, and they have very limited budgets to
    > update their website, as part of their communication/advertising costs, as most
    > of their money goes to their field action in health care, assistance, and
    > education programs).

    It's unfortunate that there is no option for them to convert their
    pages to UTF-8. How will they publish pages in African languages
    like Yoruba?
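
    To make the concern concrete, the Python sketch below tests whether a
    piece of text can be encoded in a given character set. The helper
    name fits_in and both sample strings are assumptions chosen for
    illustration. The Yoruba sample is a greeting that uses letters with
    a dot below, which ISO-8859-1 does not contain.

```python
def fits_in(text, encoding):
    """Return True if every character of text is encodable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample strings (assumed for this example, not from any web site's data).
FRENCH = "Croix-Rouge fran\u00e7aise"  # "française"; c-cedilla is in ISO-8859-1
# "Ẹ káàbọ̀": U+1EB8 and U+1ECD (dot below) lie outside ISO-8859-1
YORUBA = "\u1eb8 k\u00e1\u00e0b\u1ecd\u0300"

for label, sample in (("French", FRENCH), ("Yoruba", YORUBA)):
    for encoding in ("iso-8859-1", "utf-8"):
        print(label, encoding, fits_in(sample, encoding))
```

    Under these assumptions the French sample fits in both encodings,
    while the Yoruba sample fits only in UTF-8.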

    I do understand the point that you are making concerning budget
    considerations for non-profit organisations, but you also mention
    that this is a common problem with for-profit web sites, too.

    This common problem boils down to a failure on the part of the web
    page authors to validate their HTML.

    Browsers which display bad HTML as if there were no problems
    promote the proliferation of bad HTML web pages.

    Best regards,

    James Kass
