UTF-16 and HTML META charset

From: Erik van der Poel (erik@netscape.com)
Date: Mon Feb 14 2000 - 23:38:39 EST


Piotr Trzcionkowski wrote:
>
> all my pages in utf-16 have a proper meta declaration.

HTML's META charset does not work for character encodings that are not
ASCII-based, such as UTF-16, UCS-4 and EBCDIC: the browser has to read
the META tag as ASCII-compatible bytes before it knows the charset. Some
browser versions may autodetect UTF-16 based on the presence of
zero-valued octets and/or the BOM. The standard way to declare UTF-16 in
HTTP is via the charset parameter of the Content-Type header. For
example:

  Content-Type: text/html; charset=UTF-16BE
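
To illustrate why the META declaration cannot be relied on here, the
following is a minimal Python sketch, not what any particular browser
does. The sample page, the regular expression and the function name
prescan_meta_charset are all made up for the illustration. It scans the
raw bytes for an ASCII "charset=" token, roughly as a parser would have
to before the encoding is known:

```python
import re

# Hypothetical sample page whose META declaration names UTF-16.
PAGE = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=UTF-16">'
    "</head><body>test</body></html>"
)

# Simplified stand-in for a META prescan: find an ASCII "charset=" token
# in the raw bytes. Real browsers use more elaborate logic.
META_CHARSET = re.compile(rb"charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def prescan_meta_charset(raw: bytes):
    """Return the charset named in the raw byte stream, or None."""
    match = META_CHARSET.search(raw)
    return match.group(1).decode("ascii") if match else None


# ASCII-based encoding: the token is stored as contiguous ASCII bytes.
ascii_based = PAGE.encode("iso-8859-1")

# UTF-16: every ASCII character is paired with a zero octet, so the
# contiguous byte sequence "charset=" never appears in the stream.
utf16_be = PAGE.encode("utf-16-be")

found_in_ascii = prescan_meta_charset(ascii_based)
found_in_utf16 = prescan_meta_charset(utf16_be)
print(found_in_ascii, found_in_utf16)
```

In this sketch the scan finds the declaration in the ISO-8859-1 bytes
but not in the UTF-16 bytes, which is why the charset has to come from
outside the document, e.g. from the HTTP header.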

See the following for the definition of UTF-16BE:

  http://www.ietf.org/internet-drafts/draft-hoffman-utf16-05.txt

(By the way, when is this Internet Draft going to become an RFC?)

> http://www.trzcionk.priv.pl/

I noticed that the above document appears to be in little-endian
UTF-16. Try copying and pasting the above URL into my HTTP/HTML source
viewer:

  http://webtools.mozilla.org/web-sniffer/
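
If you would rather check the byte order yourself, here is a rough
Python sketch of the kind of guess described above, based on the BOM
and on where the zero-valued octets fall. The function name
guess_utf16_byte_order and the 256-octet sample size are my own
choices for the illustration, and the heuristic assumes mostly ASCII
text such as HTML markup:

```python
import codecs


def guess_utf16_byte_order(raw: bytes):
    """Guess "little", "big", or None for a byte stream.

    Heuristic only: it trusts a leading BOM, then falls back to
    counting zero octets, which only works for mostly-ASCII text.
    """
    # A BOM at the start settles the question (FF FE or FE FF).
    # Note: a UTF-32 little-endian BOM also starts with FF FE; this
    # sketch does not try to tell the two apart.
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "little"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "big"

    # No BOM: look at where the zero-valued octets fall.
    sample = raw[:256]
    even_zeros = sample[0::2].count(0)  # octets at offsets 0, 2, 4, ...
    odd_zeros = sample[1::2].count(0)   # octets at offsets 1, 3, 5, ...

    # Little-endian puts the low octet first, so for ASCII characters
    # the zero octet lands on the odd offsets; big-endian is the reverse.
    if odd_zeros > even_zeros:
        return "little"
    if even_zeros > odd_zeros:
        return "big"
    return None  # no evidence either way, e.g. plain ASCII


le_guess = guess_utf16_byte_order("<html>".encode("utf-16-le"))
be_guess = guess_utf16_byte_order("<html>".encode("utf-16-be"))
print(le_guess, be_guess)
```

A result of "little" for a page served without a BOM would suggest
labelling it UTF-16LE rather than UTF-16BE in the Content-Type header.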

Erik



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:59 EDT