UTF-8

From: Stephen Toner (toners5@hotmail.com)
Date: Tue Sep 19 2000 - 10:23:00 EDT


Hi,
I am still having trouble with UTF-8 input from a browser. The problem is that my database can't store UTF-8, only UTF-16. I have tried to convert between the two with little success. The trouble is that the input string is obtained from the request object using String temp = request.getParameter("TheText");
This leaves me with a string which I think (please correct me if I'm wrong) is correctly encoded in UTF-8 (for example, a Japanese character was converted to a 3-byte sequence). However, the String API only lets me convert a byte array containing non-Unicode text to Unicode, or convert a String object into a byte array of non-Unicode characters. What I have is a string of non-Unicode characters which I must convert to Unicode characters. I tried converting it to bytes: without specifying the encoding, this left 2 question marks in, and specifying the encoding as UTF-8 just converted each character to UTF-16, giving 6 bytes instead of the 2 bytes that I wanted. If I could somehow get the byte values for each character I would be flying, but unfortunately a load of different characters get converted to 3F, the code for a question mark.
Does anyone know of any way of converting directly in Java?
Also, when I submit a form page with the encoding specified as UTF-8, what actually does the conversion from what is in the form to UTF-8?
Thanks for any help,
Stephen
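
A minimal sketch of the situation described in the message above, under one assumption that the message does not confirm: the servlet container decoded the raw UTF-8 request bytes as ISO-8859-1, so each byte ended up as one char in the String. The class and method names (Utf8Redecode, misdecode, redecode) are hypothetical and used only for illustration. If the assumption holds, the original byte values can be recovered by encoding the string back with ISO-8859-1 and then decoding those bytes as UTF-8.

```java
import java.nio.charset.StandardCharsets;

class Utf8Redecode {
    // Simulates the ASSUMED container behaviour: UTF-8 bytes are read as
    // ISO-8859-1, so every byte becomes one char in U+0000..U+00FF.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Recovers the byte values (ISO-8859-1 maps chars U+0000..U+00FF to
    // bytes one-to-one) and decodes them as UTF-8.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+3042 (HIRAGANA LETTER A) is the three bytes E3 81 82 in UTF-8.
        String temp = misdecode("\u3042");
        System.out.println(temp.length());                                // 3
        // Encoding the mis-decoded string as UTF-8 doubles each char.
        System.out.println(temp.getBytes(StandardCharsets.UTF_8).length); // 6

        String fixed = redecode(temp);
        System.out.println(fixed.length());                                    // 1
        System.out.println(fixed.getBytes(StandardCharsets.UTF_16BE).length);  // 2
    }
}
```

StandardCharsets requires Java 7 or later; on older JVMs the same calls can be written with charset names, for example temp.getBytes("ISO-8859-1") and new String(raw, "UTF-8"), which throw UnsupportedEncodingException. If the container used a different decoding, this round trip will not recover the original bytes.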


