From: John Cowan (firstname.lastname@example.org)
Date: Sat Sep 27 2003 - 10:47:51 EDT
> First, the browser checks the HTTP header, then the XML declaration
> (which is not relevant to HTML), then the HTML meta tag.
> Apparently, upon finding character set information, the operation
> stops, so if information is present in the HTTP header, the meta
> tag won't be consulted.
It's worse than that. If the HTTP header says "text/xml" or "text/html",
and no charset information is provided, a fully conforming browser
MUST treat this as if the charset "us-ascii" is specified. That's
just insane, but such are the rules.
Only if there is no header, or if the header says "application/xml",
do we get to proceed to other sources of knowledge.
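As a rough illustration, the precedence described above could be sketched in Python as follows. The function name effective_charset and its parameters are hypothetical, and the logic follows only the rules as stated in this message, not the text of the specifications. How media types other than the three named here should be handled is an assumption.

```python
from email.message import Message


def effective_charset(content_type, xml_encoding=None, meta_charset=None):
    """Return the charset a strictly conforming browser would use,
    following the precedence described in this message (a sketch)."""
    if content_type is not None:
        # Reuse the stdlib MIME header parser for the Content-Type value.
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()  # lowercased, or None
        if charset:
            # Explicit charset in the HTTP header wins; nothing else is consulted.
            return charset
        if msg.get_content_type() in ("text/xml", "text/html"):
            # No charset given: treated as if "us-ascii" had been specified.
            return "us-ascii"
        # Assumption: "application/xml" (and any other type) falls through.
    # No header, or a header that lets us proceed: consult the XML
    # declaration first, then the HTML meta tag.
    return xml_encoding or meta_charset
```

For example, effective_charset("text/html", meta_charset="utf-8") ignores the meta tag and yields "us-ascii", which is the result most people don't care for.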
> All of the data should be consulted and there should be some kind
> of protocol in place to handle conflicting character set info.
It *is* in place and fully specified. It's just that most of us
don't care for the results, and most programs don't fully conform
for that reason.
--
John Cowan
email@example.com
http://www.reutershealth.com
http://www.ccil.org/~cowan

Some people open all the Windows;
wise wives welcome the spring
by moving the Unix.
  --ad for Unix Book Units (U.K.)
  (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)