I suggest that you look at
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.1 for the
You can look at http://www.w3.org/International/ for a more general
From: Magda Danish (Unicode) [mailto:firstname.lastname@example.org]
Sent: Friday, October 06, 2000 9:52 AM
To: Unicode List
Subject: FW: information request; using unicode in HTML form; urlencoded
From: Hung Le [mailto:email@example.com]
Sent: Thursday, October 05, 2000 3:21 PM
Subject: information request; using unicode in HTML form; urlencoded
Our company is exploring the idea of using Unicode in our web pages.
We ran into a problem that, despite our effort researching for the last two
are not able to find an answer. The problem is related to passing text from
an HTML form to the webserver.
From the user's perspective:
. we present the user with a web page containing a form.
. the user fills in the form
. the user clicks on "Submit"
. the browser posts the data entered to the server
From what I can gather so far, the data flow is as follows:
. when the user clicks on the submit button, the browser encodes the
data using the following algorithm:
The ASCII characters 'a' through 'z', 'A' through 'Z', and '0' through '9'
remain the same.
The space character ' ' is converted into a plus sign '+'.
All other characters are converted into the 3-character string "%xy", where
xy is the two-digit hexadecimal representation of the lower 8 bits of the
character.
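
The three rules above can be sketched in a few lines of Python. This is only an illustration of the algorithm as described in this message, not the code any particular browser runs; the function name form_encode_8bit is made up for the example.

```python
import string

# Characters that rule 1 leaves unchanged: a-z, A-Z, 0-9.
UNCHANGED = set(string.ascii_letters + string.digits)


def form_encode_8bit(text: str) -> str:
    """Encode text with the three rules described in the message.

    Hypothetical helper for illustration: it keeps only the lower
    8 bits of each remaining character, so it is lossy for any
    code point above U+00FF.
    """
    out = []
    for ch in text:
        if ch in UNCHANGED:
            out.append(ch)                       # rule 1: keep as-is
        elif ch == " ":
            out.append("+")                      # rule 2: space -> plus
        else:
            low_byte = ord(ch) & 0xFF            # rule 3: lower 8 bits only
            out.append("%{:02X}".format(low_byte))
    return "".join(out)


print(form_encode_8bit("Hello World 42"))  # Hello+World+42
print(form_encode_8bit("a&b=1"))           # a%26b%3D1
print(form_encode_8bit("\u4e2d"))          # %2D  (U+4E2D loses its high byte)
```
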
The last rule will clip a Unicode character to an 8-bit value,
thus the data entered into the HTML form will not make it back to the web
server.
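
To make the loss concrete, here is a small Python sketch. It first applies the "lower 8 bits" rule to one character and decodes the result, then, for contrast, percent-encodes the UTF-8 bytes of the same character using the standard urllib.parse functions. The UTF-8 variant is an assumption about one possible way to carry the data, not something this message says browsers do, and it only works if the server also decodes the bytes as UTF-8.

```python
from urllib.parse import quote_plus, unquote_plus

original = "\u4e2d"  # a character outside the 8-bit range (U+4E2D)

# "Lower 8 bits" rule: 0x4E2D & 0xFF == 0x2D, which is the ASCII hyphen.
clipped = "%{:02X}".format(ord(original) & 0xFF)
recovered_clipped = unquote_plus(clipped)

# Alternative (assumed here): escape each byte of the UTF-8 form instead.
utf8_encoded = quote_plus(original, encoding="utf-8")
recovered_utf8 = unquote_plus(utf8_encoded, encoding="utf-8")

print(clipped, repr(recovered_clipped))                   # %2D '-'
print(utf8_encoded, recovered_utf8 == original)           # %E4%B8%AD True
```
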
Do you have experience in this area? How does one capture the data entered in
an HTML form in Unicode and send it along when the user clicks on the "Submit"
button?
Thanks for any help you can provide.
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:14 EDT