Re: Mojibake on my Web pages

From: jon@spin.ie
Date: Wed Sep 24 2003 - 14:11:05 EDT

    > > It is very irritating that the HTTP header overrules the <meta> tag,
    > > since it seems that the error is more often in the HTTP header than in
    > > the <meta> tag.
    >
    > Indeed. You'd think if the author (or software) included a <meta> tag
    > AND an explicit declaration in the XML header, he (or it) knew what he
    > (or it) was doing and the tag(s) should be honored.

    Experience gives no reason to assume that degree of competence in authors, and certainly no reason to assume more of it in authors than in server administrators.

    However, the policy is not a judgement call about whether authors are more likely to include incorrect declarations (they are) or server administrators to set incorrect headers (they are too). Having the HTTP header override the declaration contained in the document has a sound technical basis:

    The author was not the last entity to "touch" the document; the server was. The server could therefore have re-encoded the document (as some servers and other agents may do with text/* documents) without altering any self-description features specific to that type of document. So, assuming a reasonable degree of competence on the part of both author and server, only the server's description of the encoding can be trusted.
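
    To make the precedence concrete, here is a rough Python sketch of how a receiver might pick the encoding. It is only an illustration, not any browser's actual algorithm: the function names, the regular expression, the 1024-byte sniffing window and the ISO-8859-1 fallback are all my own assumptions. The example at the end models a server that transcodes a UTF-8 page to ISO-8859-1 and leaves the <meta> tag alone.

```python
import re
from email.message import Message

# Hypothetical, deliberately loose pattern for a charset inside a <meta> tag.
META_RE = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE
)


def charset_from_http(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, None if absent


def effective_encoding(content_type, body, default="iso-8859-1"):
    """HTTP header first, then the <meta> declaration, then an assumed default."""
    http_charset = charset_from_http(content_type)
    if http_charset:
        return http_charset
    match = META_RE.search(body[:1024])  # only sniff the start of the document
    if match:
        return match.group(1).decode("ascii").lower()
    return default


# The author declared UTF-8 in the document itself...
authored = (
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "<p>caf\u00e9</p>"
)
# ...but the server re-encoded the bytes and did not touch the <meta> tag.
on_the_wire = authored.encode("iso-8859-1")

enc = effective_encoding("text/html; charset=ISO-8859-1", on_the_wire)
text = on_the_wire.decode(enc)  # decodes correctly because the header wins
```

    Had the receiver believed the <meta> tag instead, it would have tried to read ISO-8859-1 bytes as UTF-8.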

    In practice it doesn't work like that, and browsers have to add features that let users change the encoding manually.

    Maybe including a BOM would help the browser realise something was awry, but it's just as likely to think the author just wrote an invalid document that began with 
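
    For illustration only, and assuming the document is UTF-8 with a leading BOM while the receiver has been told ISO-8859-1, this small Python sketch shows both outcomes: a receiver that checks for the BOM can notice the mismatch, and one that trusts the label sees three stray characters at the start of the text. The helper name is hypothetical.

```python
import codecs


def starts_with_utf8_bom(data):
    """True if the byte string begins with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)  # b'\xef\xbb\xbf'


# A UTF-8 document with a BOM in front of the markup.
doc = codecs.BOM_UTF8 + "<p>caf\u00e9</p>".encode("utf-8")

# A receiver that sniffs the BOM has a hint that the label may be wrong.
has_bom = starts_with_utf8_bom(doc)

# A receiver that trusts a wrong ISO-8859-1 label turns the BOM into text.
as_latin1 = doc.decode("iso-8859-1")
stray_prefix = as_latin1[:3]  # U+00EF, U+00BB, U+00BF

# Decoding as UTF-8 with BOM handling recovers the intended text.
as_utf8 = doc.decode("utf-8-sig")
```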



    This archive was generated by hypermail 2.1.5 : Wed Sep 24 2003 - 14:59:14 EDT