What exactly happens when I use the <%@ page contentType="text/html; utf-8"
%> directive? When I include it, letters from the database, which
were stored correctly, are rendered incorrectly. They come back in exactly the
same form as they arrived from the form, but when I try to output them to a
page with this directive it doesn't combine the UTF-8 bytes to form the
character, and instead treats the bytes as separate characters. Without it I
just use the meta tag to have the bytes interpreted as UTF-8.
Hi Stephen,

Java's internal encoding is UTF-16. Every String is encoded as
UTF-16. Since no web pages are generated in that encoding, JSP provides a
basic mechanism for setting up a character set converter (essentially an
InputStreamReader and an OutputStreamWriter).
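
To make the converter idea concrete, here is a minimal sketch of decoding and encoding with InputStreamReader and OutputStreamWriter. The class and method names (CharsetStreams, decode, encode) are made up for illustration, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets; a JSP container does something along these lines for you, but this is not its actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class CharsetStreams {
    // Decode raw bytes into a String (UTF-16 internally) using the given charset.
    static String decode(byte[] bytes, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Encode a String into bytes using the given charset.
    static byte[] encode(String s, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // U+65E5 (a Japanese character) as three UTF-8 bytes.
        byte[] utf8 = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5};
        String s = decode(utf8, StandardCharsets.UTF_8);
        System.out.println(s.length());                               // 1
        System.out.println(encode(s, StandardCharsets.UTF_8).length); // 3
    }
}
```

The point is that the String itself never "is" UTF-8; the charset only matters at the moment bytes are turned into chars or chars into bytes.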

The default page encoding for JSP is ISO-8859-1. The page will encode its
output as UTF-8 instead of 8859-1 if you use the <%@ page
contentType="text/html; charset=utf-8" %> directive in your page.
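
To see what switching the output encoding does to a String, here is a small sketch. It assumes (and this is only an assumption about a typical failure, not a diagnosis) that the String was built by decoding UTF-8 bytes as ISO-8859-1, so each byte sits in the String as its own char. OutputEncodingDemo and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JSP API.
class OutputEncodingDemo {
    // Simulates a String built by decoding UTF-8 bytes as ISO-8859-1:
    // every byte becomes its own char in the range U+0000..U+00FF.
    static String decodeUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Formats the bytes produced when the String is written in the given charset.
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String misdecoded = decodeUtf8AsLatin1("\u65E5"); // one Japanese character
        System.out.println(misdecoded.length());                          // 3
        System.out.println(hex(misdecoded, StandardCharsets.ISO_8859_1)); // E6 97 A5
        System.out.println(hex(misdecoded, StandardCharsets.UTF_8));      // C3 A6 C2 97 C2 A5
    }
}
```

Written as 8859-1, such a String goes out as the original three bytes, which a browser told to read UTF-8 will combine into one character. Written as UTF-8, each of the three chars is encoded separately, giving six bytes that display as three separate characters.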

If you wish to receive a UTF-8 "POST" or "GET" in an 8859-1 page, you will
need to set up the InputStreamReader to convert the characters yourself. I
know I'm being sketchy here, but I'm running late this morning. Let me
know if the contentType directive doesn't fix your problem.
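
A rough sketch of doing that conversion by hand, assuming the parameter value reached you decoded as ISO-8859-1 although the browser sent UTF-8. ParameterRepair and redecodeAsUtf8 are hypothetical names, and the hard-coded String stands in for whatever the request object returned.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative only.
class ParameterRepair {
    // Recover the original bytes (ISO-8859-1 maps chars U+0000..U+00FF
    // one-to-one onto bytes), then decode those bytes as UTF-8.
    static String redecodeAsUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for a request parameter: the UTF-8 bytes E6 97 A5,
        // each one sitting in the String as a separate char.
        String fromRequest = "\u00E6\u0097\u00A5";
        String fixed = redecodeAsUtf8(fromRequest);
        System.out.println(fixed.length());                  // 1
        System.out.println((int) fixed.charAt(0) == 0x65E5); // true
    }
}
```

This only works if the container really did decode the bytes as 8859-1; if it used some other charset the original bytes may already be lost. Where the servlet API offers request.setCharacterEncoding (Servlet 2.3 and later, as far as I know), calling it before reading any parameter is likely the cleaner route for POST bodies.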



> Hi,
> I am still having trouble with inputted UTF-8 from a browser. The problem
> is that my database can't store UTF-8 but only UTF-16. I have tried to
> convert between the two with little success. The trouble is that the
> inputted string is obtained from the request object using String
> This leaves me with a string which I think (please correct me if I'm wrong)
> is correctly encoded in UTF-8 (for example, a Japanese character was
> converted to a 3-byte sequence). However, the String API only allows me
> to convert a byte array containing non-Unicode text to Unicode, or to
> convert a String object into a byte array of non-Unicode characters. But
> what I have is a string of non-Unicode characters which I must convert to
> Unicode characters. I tried converting it to bytes, which without
> specifying the encoding left 2 question marks in, and with the
> encoding specified as UTF-8 just converted each character to UTF-16, giving
> 6 bytes instead of the 2 bytes that I wanted. If I were able to somehow get
> the byte values for each character I would be flying, but unfortunately a
> load of different characters get converted to 3F, the code for a question
> mark.
> Does anyone know of any way of converting directly in Java?
> Also, when I submit a form page with the encoding specified as UTF-8, what
> actually does the converting from what is in the form to UTF-8?
> Thanks for any help,
> Stephen

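On the question marks and byte counts in the quoted message: a small sketch of where 3F can come from and how the 2 bytes can be obtained. It assumes the String holds the three UTF-8 bytes as three separate chars; US-ASCII is used here only as an example of a charset that cannot represent them (the platform default in the original case is unknown). QuestionMarkDemo is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class QuestionMarkDemo {
    // Formats a byte array as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String misdecoded = "\u00E6\u0097\u00A5";

        // Chars the target charset cannot represent are replaced by '?' (0x3F).
        System.out.println(hex(misdecoded.getBytes(StandardCharsets.US_ASCII))); // 3F 3F 3F

        // Get the byte value of each char via ISO-8859-1, decode as UTF-8,
        // then ask for big-endian UTF-16 without a byte order mark.
        String fixed = new String(
                misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
        System.out.println(hex(fixed.getBytes(StandardCharsets.UTF_16BE)));      // 65 E5
    }
}
```

Note that once the String is correct, there is usually no need to produce UTF-16 bytes by hand; a JDBC driver is normally handed the String itself.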
