>I have added a couple more variations of the Unicode supplementary
>characters example page, for utf-16 and utf-32.
I am not sure if your UTF-16 and UTF-32 test pages really conform to the
HTML standard. The server states a content type of "text/html" without
charset information. From the content type a browser should therefore
expect pure ASCII - at least until the META tag defining the document's
character encoding has been parsed.
From the HTML 4.01 specification
<http://www.w3.org/TR/html4/charset.html>, section 5.2.2:
"The META declaration must only be used when the character encoding is
organized such that ASCII-valued bytes stand for ASCII characters (at
least until the META element is parsed)."
Your documents, however, just start with a BOM, and I couldn't find
anything stating that a BOM would be a valid way of specifying the
character encoding.
Although some browsers seem to guess the character encoding from an
available BOM, I wouldn't expect them to do so when there usually are
other ways of determining this information.
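For illustration, the kind of BOM-based guessing some browsers appear to do could look roughly like the sketch below. This is only an assumption about the general technique, not a description of any particular browser; the names BOMS and sniff_bom are made up for this example.

```python
import codecs

# BOM signatures to test, in order. UTF-32 has to come before UTF-16
# because the UTF-32 LE BOM (FF FE 00 00) starts with the UTF-16 LE
# BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return an encoding name guessed from a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

A document without a BOM yields None here, in which case the encoding would have to come from the HTTP header or the META declaration.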
To get a second opinion I asked w3.org's online validation service to
check your UTF-16 document with auto detection of the character encoding.
The Validator complained about the BOM as well as (not surprisingly) a
lot of ASCII zero (0x00) characters.
However, when giving the validator an ASCII-only document with a META tag
specifying UTF-16 as the encoding (just for testing), it says that it does
not yet support this encoding, so I don't fully trust the validator in
this case.
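The zero bytes are what one would expect when UTF-16 data is read as if it were ASCII, and they also show why the condition quoted from section 5.2.2 is not met. The following small sketch uses a made-up META line (not taken from the actual test pages) and big-endian UTF-16 chosen arbitrarily:

```python
# Hypothetical META declaration, used only as sample text.
text = '<meta http-equiv="Content-Type" content="text/html; charset=utf-16">'

# Encode as big-endian UTF-16 without a BOM.
encoded = text.encode("utf-16-be")

# Every ASCII character becomes two bytes, one of which is 0x00.
nul_count = encoded.count(b"\x00")

# A parser scanning the raw bytes for the ASCII sequence "<meta"
# does not find it, because a zero byte sits between the letters.
ascii_scan_finds_meta = b"<meta" in encoded
```

With this sample, nul_count equals the number of characters in the text, and the ASCII byte scan does not find the META tag.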
-- Steffen Kamp mailto:firstname.lastname@example.org http://homepage.mac.com/earthlingsoft
This archive was generated by hypermail 2.1.2 : Fri Apr 19 2002 - 18:17:05 EDT