Re: Fun with proof by analogy, was Re: Mojibake on my Web pages

From: John Cowan
Date: Sat Sep 27 2003 - 10:47:51 EDT

    scripsit:

    > First, the browser checks the HTTP header, then the XML declaration
    > (which is not relevant to HTML), then the HTML meta tag.
    > Apparently, upon finding character set information, the operation
    > stops, so if information is present in the HTTP header, the meta
    > tag won't be consulted.

    It's worse than that. If the HTTP header says "text/xml" or "text/html",
    and no charset information is provided, a fully conforming browser
    MUST treat this as if the charset "us-ascii" were specified. That's
    just insane, but such are the rules.

    Only if there is no header, or if the header says "application/xml",
    do we get to proceed to other sources of knowledge.
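
    As a rough illustration of the rules as stated in this message, here is
    a minimal Python sketch of the lookup order. The names pick_charset and
    parse_content_type, the two regular expressions, and the handling of
    media types the message does not mention are all assumptions made for
    the example; this is not taken from any browser or specification text.

```python
import re


def parse_content_type(header):
    """Split a Content-Type value into (media_type, charset or None)."""
    if header is None:
        return None, None
    parts = [p.strip() for p in header.split(";")]
    media_type = parts[0].lower()
    charset = None
    for param in parts[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value:
            charset = value.strip().strip('"').lower()
    return media_type, charset


# Simplified patterns (assumptions): real XML and HTML sniffing is more involved.
XML_DECL = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')
META = re.compile(rb'<meta[^>]*charset=["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)


def pick_charset(content_type, body):
    """Return the charset chosen under the rules described in the message.

    content_type is the raw HTTP Content-Type value, or None if absent.
    body is the document as bytes.
    """
    media_type, charset = parse_content_type(content_type)

    # Explicit charset in the HTTP header: the search stops here.
    if charset:
        return charset

    # text/xml or text/html with no charset: treated as us-ascii,
    # and the document itself is never consulted.
    if media_type in ("text/xml", "text/html"):
        return "us-ascii"

    # Only a missing header or application/xml lets us look further.
    # Other media types are outside the message's scope (assumption).
    if media_type not in (None, "application/xml"):
        return None

    # XML declaration first (it must open the document), then the meta tag.
    m = XML_DECL.match(body)
    if m:
        return m.group(1).decode("ascii").lower()
    m = META.search(body)
    if m:
        return m.group(1).decode("ascii").lower()
    return None
```

    Under this sketch, a page served as plain "text/html" whose meta tag
    declares iso-8859-1 still comes out as us-ascii, because the meta tag
    is never reached. That is the result complained about above.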

    > All of the data should be consulted and there should be some kind
    > of protocol in place to handle conflicting character set info.

    It *is* in place and fully specified. It's just that most of us
    don't care for the results, and most programs don't fully conform
    for that reason.

    Some people open all the Windows;       John Cowan
    wise wives welcome the spring 
    by moving the Unix.           
      --ad for Unix Book Units (U.K.)

    This archive was generated by hypermail 2.1.5 : Sat Sep 27 2003 - 11:37:35 EDT