Re: HTML Form Encoding Problem

From: Erik van der Poel
Date: Fri Jan 14 2000 - 13:52:13 EST

I think the best short-term solution is MSIE5's _charset_ hack, and the
best long-term solution is ENCTYPE multipart/form-data.

The _charset_ hack works on all current browsers, as long as the user is
not using a translator (e.g. French -> Traditional Chinese) or a
transcoding intermediary (proxy, gateway, etc.). Translators and
transcoders are rarely used, so this is probably the best short-term
solution. The steps are:

1. If at all possible, append the charset parameter to the HTTP
Content-Type header of the document containing the <FORM> element.

  Content-Type: text/html; charset=iso-8859-1

If a CGI script is used to *send* the form (as opposed to *receiving*
the user's form submission), then it is trivial to add the charset
parameter. Otherwise, you need to adjust your server's settings,
which vary from vendor to vendor. The HTTP charset is important because
the HTML META charset is not always reliable.
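
As a rough sketch of the CGI case, assuming the form is sent by a Python
script: the function name form_page and the action URL /cgi-bin/receive
are made up for illustration and are not part of the original advice.
The only point is that the charset in the header and the encoding of the
body agree.

```python
def form_page(charset: str = "iso-8859-1") -> bytes:
    """Build a CGI response whose HTTP header names the page's charset."""
    html = (
        "<html><head><title>Example form</title></head>\n"
        "<body>\n"
        # Hypothetical action URL, for illustration only.
        '<form method="post" action="/cgi-bin/receive">\n'
        '<input type="text" name="name">\n'
        '<input type="submit">\n'
        "</form>\n"
        "</body></html>\n"
    )
    # CGI output is the header block, a blank line, then the document.
    header = "Content-Type: text/html; charset=" + charset + "\r\n\r\n"
    # Encode the body with the same charset the header declares.
    return header.encode("ascii") + html.encode(charset)
```

A CGI script would write the returned bytes to sys.stdout.buffer.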

2. Add the HTML META charset to the document containing the <FORM>
element.

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

Even if you add the charset to the HTTP header, it is still useful to
have an HTML META charset, since the user might save the document to a
file, thereby losing the HTTP header info. Put the META element as close
to the beginning as possible.

<meta http-equiv="...

3. Use the _charset_ hack in each <FORM> element.

<input type="hidden" name="_charset_" value="iso-8859-1">
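
On the receiving side, the script can read the _charset_ field first and
then decode the other fields with it. The following is a minimal sketch
in Python, assuming a percent-encoded application/x-www-form-urlencoded
body; the function name decode_form and the iso-8859-1 fallback are
assumptions for illustration.

```python
import codecs
from urllib.parse import parse_qsl

DEFAULT_CHARSET = "iso-8859-1"  # assumed fallback, matching the example form


def decode_form(body: bytes) -> dict:
    """Decode a urlencoded form body using its own _charset_ field."""
    # Assumption: the body is percent-encoded, so it is pure ASCII.
    text = body.decode("ascii")

    # First pass: latin-1 maps every byte to a character, so it cannot
    # fail. The _charset_ value is plain ASCII and survives this pass.
    first_pass = dict(parse_qsl(text, keep_blank_values=True,
                                encoding="latin-1"))
    charset = first_pass.get("_charset_", DEFAULT_CHARSET)

    # Fall back if the submitted charset name is unknown to Python.
    try:
        codecs.lookup(charset)
    except LookupError:
        charset = DEFAULT_CHARSET

    # Second pass: decode every field with the declared charset.
    return dict(parse_qsl(text, keep_blank_values=True,
                          encoding=charset, errors="replace"))
```

For example, decode_form(b"_charset_=iso-8859-1&name=Andr%E9") yields a
"name" field containing "André".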


Note that the other approach, putting special strings in hidden fields,
doesn't seem to be worth the trouble: the strings would get munged if
the browser cannot reliably determine the charset of the document.

The multipart/form-data solution is not implemented correctly in current
browsers, but hopefully it will be in the future, and it doesn't suffer
from the translator/transcoder problems.
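
To illustrate what the long-term solution could look like on the server,
here is a sketch that assumes the browser labels each part with its own
Content-Type charset, which the multipart/form-data format permits. It
uses Python's standard email package to split the parts; the function
name decode_multipart and the iso-8859-1 fallback are assumptions.

```python
from email import message_from_bytes

DEFAULT_CHARSET = "iso-8859-1"  # assumed fallback when a part has no charset


def decode_multipart(content_type: str, body: bytes) -> dict:
    """Decode text fields of a multipart/form-data body, part by part."""
    # The email parser needs the Content-Type header (which carries the
    # boundary) in front of the body, so rebuild a minimal MIME message.
    head = (b"MIME-Version: 1.0\r\nContent-Type: "
            + content_type.encode("ascii") + b"\r\n\r\n")
    message = message_from_bytes(head + body)
    if not message.is_multipart():
        raise ValueError("not a multipart body")

    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        # Each part may declare its own charset in its Content-Type.
        charset = part.get_content_charset() or DEFAULT_CHARSET
        raw = part.get_payload(decode=True)
        fields[name] = raw.decode(charset, errors="replace")
    return fields
```

Because the charset travels with each part, the server does not need a
separate hidden field to learn it.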

