Re: HTML forms and UTF-8

From: yergeau@alis.com
Date: Mon Nov 08 1999 - 20:25:03 EST


> From: Chris Wendt [mailto:christw@microsoft.com]
> Date: Sunday, November 7, 1999, 22:30
>
> (note the dash instead of underscore)

Oops!

> Internet Explorer 5:
>
> 1) Sets a hidden field named "_charset_" to the encoding the FORM data was
> submitted in. ("_charset_" name includes the underscores).

That alone goes a long, long way towards fixing the broken form submission
mechanism.
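
To make that concrete, here is a minimal sketch of how a server-side script
might use the field. It assumes a URL-encoded body and Python's standard
urllib.parse; the function name decode_form and the ISO-8859-1 fallback are
illustrative assumptions, not anything specified in this thread.

```python
import codecs
from urllib.parse import parse_qsl


def decode_form(body: str, default_charset: str = "iso-8859-1") -> dict:
    """Decode a URL-encoded form body using its "_charset_" field, if present.

    decode_form and default_charset are hypothetical names for illustration.
    """
    # First pass: latin-1 maps every byte to a code point and never fails,
    # which is enough to read the ASCII-only value of "_charset_".
    raw = dict(parse_qsl(body, keep_blank_values=True, encoding="latin-1"))
    charset = raw.get("_charset_") or default_charset

    # Fall back to the default if the browser named an unknown encoding.
    try:
        codecs.lookup(charset)
    except LookupError:
        charset = default_charset

    # Second pass: decode every field with the encoding the browser declared.
    return dict(
        parse_qsl(body, keep_blank_values=True, encoding=charset, errors="replace")
    )
```

With a body such as "_charset_=utf-8&name=Fran%C3%A7ois", the second pass
decodes the name field as UTF-8 instead of guessing.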

> 2) Submits in UTF-8 if Accept-Charset="UTF-8" is given in the <form> and
> input is found which does not fit into the form page's encoding.

Why the latter condition? It seems to me that if the form author says
Accept-Charset="UTF-8", that's what he wants. This behaviour also seems
less deterministic: saying UTF-8 does not cause UTF-8 to be returned but MAY
cause it, depending on what each user types.
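
The dependence on user input can be modelled in a few lines. This is only a
sketch of behaviour (2) as quoted above, not of any actual browser code; the
function name submission_charset is hypothetical, and what happens when the
input does not fit and no UTF-8 Accept-Charset is given is an assumption,
since the quoted message does not say.

```python
from typing import Optional


def submission_charset(
    page_encoding: str, accept_charset: Optional[str], user_input: str
) -> str:
    """Return the encoding a form would be submitted in under rule (2)."""
    try:
        # The input fits the page's encoding, so the page encoding is used,
        # even though the author asked for UTF-8.
        user_input.encode(page_encoding)
        return page_encoding
    except UnicodeEncodeError:
        # The input does not fit: only now does Accept-Charset take effect.
        if accept_charset and accept_charset.lower() == "utf-8":
            return "utf-8"
        # Assumption: not covered by the quoted description.
        return page_encoding
```

The same form on an ISO-8859-1 page with Accept-Charset="UTF-8" would thus
come back in ISO-8859-1 from a user typing "François" and in UTF-8 from a
user typing Japanese text.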

> You could prepopulate the _charset_ field with the form page's encoding so
> it always gets returned to your CGI.

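Er, no, don't do that! That would break in case of transcoding: the
prepopulated value would still name the page's original encoding, not the
one the page was actually delivered in.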
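
A small sketch of the failure, under an assumed scenario: something between
server and browser transcodes the page from ISO-8859-1 to UTF-8 but leaves
the prepopulated field value alone, and the browser submits in the encoding
it received without updating the field. The function name server_view and
the scenario itself are illustrative assumptions.

```python
def server_view(typed: str, actual_encoding: str, label: str) -> str:
    """Show what a CGI sees when it trusts a possibly stale charset label."""
    # Bytes the browser sends, in the encoding the page really arrived in.
    wire = typed.encode(actual_encoding)
    # The CGI decodes them with whatever the "_charset_" field claims.
    return wire.decode(label, errors="replace")


# Stale label: the page was transcoded to UTF-8, the field still says ISO-8859-1.
garbled = server_view("François", "utf-8", "iso-8859-1")
# Correct label: the field matches the encoding actually used.
intact = server_view("François", "utf-8", "utf-8")
```

Here garbled comes out as "FranÃ§ois", while intact is the original string.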

Thanks for the good news.

--
François


