UCS-2 and UTF-16

From: pr1@club-internet.fr
Date: Fri Sep 13 2002 - 03:51:29 EDT


Hello,

Thank you all for the useful information you provided.

However, I am now completely confused.

Let me recapitulate. Please tell me if any of my assertions are wrong.

1) According to Microsoft Knowledge Base article number
Q2322580, SQL Server 2000 stores Unicode data in UCS-2.

2) As far as Java 2 is concerned, UTF-16 is the same as UCS-2. That is,
in order to convert a String from a particular encoding (e.g. UTF-8) to
UCS-2, you would use the following String constructor:

String myUCS2String = new String ( stringinSayUTF8.getBytes(),
"UTF-16" );

or

String myUCS2String = new String ( stringinSayUTF8.getBytes() );

since UTF-16 is Java's default encoding.
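
Assertion 2 can be checked outside JRun with a small stand-alone test. The sketch below assumes a Java 7 or later runtime (for StandardCharsets, which is not part of the setup described here); the class and method names are made up for illustration. It relies on two facts: a Java String already holds UTF-16 code units internally, and the charset passed to the String constructor names the encoding the bytes are in, not the encoding you want to end up with.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf16Check {
    // Decode bytes using the encoding they were actually written in.
    static String decodeUtf8(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    // Mirrors the first snippet in 2): the bytes are treated as UTF-16.
    static String decodeAsUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent, a space, and two CJK characters.
        String original = "caf\u00e9 \u65e5\u672c";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // The no-argument getBytes() and new String(byte[]) use this charset,
        // which depends on the platform and JVM settings.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 decode round-trips:  "
                + original.equals(decodeUtf8(utf8)));
        System.out.println("UTF-16 decode round-trips: "
                + original.equals(decodeAsUtf16(utf8)));
    }
}
```

If the second check fails while the first succeeds, the bytes should be decoded with the charset they were produced in, and no separate "conversion to UTF-16" step is needed afterwards.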

3) Since JRun 3.1 uses the ISO8859_1 charset to pass parameters via
HTML headers, you must retrieve the bytes using that encoding and
convert those bytes to UTF-16 before storing them in the SQL Server 2000
database:

byte[] byt = newFaqLibelle.getBytes( "ISO8859_1" );

String newFaqLibelleIsoVersUtf = new String( byt, "UTF-16" );

or

String newFaqLibelleIsoVersUtf = new String( byt );

<store in DB>
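
For comparison, a commonly used variant of step 3 re-decodes the recovered bytes with the charset the browser actually sent. The sketch below assumes that the form was submitted as UTF-8 and that the container decoded it as ISO-8859-1; neither assumption has been verified for JRun 3.1, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class ParamRecode {
    // Undo an ISO-8859-1 decoding of bytes that were really UTF-8
    // (assumption: the browser submitted the form as UTF-8).
    static String recover(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Simulates what the container is assumed to do with UTF-8 form data.
    static String simulateContainer(String typedByUser) {
        byte[] sent = typedByUser.getBytes(StandardCharsets.UTF_8);
        return new String(sent, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e";          // three CJK characters
        String seenByJsp = simulateContainer(typed);   // mis-decoded text
        System.out.println("recovered equals typed: "
                + typed.equals(recover(seenByJsp)));
    }
}
```

This works because ISO-8859-1 maps every byte value to exactly one character, so getBytes( "ISO8859_1" ) gives back the original bytes unchanged. The resulting String can then be handed to JDBC as it is.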

4) To retrieve multiple-byte characters from a SQL Server 2000 DB, you
must convert them back to UTF-8 as follows:

out.println( new String( myDataFromDB.getBytes(), "UTF-8" ) );

5) Since some JDBC drivers use the OS's default charset (Cp1252 in my
case), the above conversions are totally USELESS. I surmise that my
JDBC version is not nvarchar compatible. How can I find that out?
Unfortunately, you can't just type "jdbc -v" on Windows to find the JDBC
version.
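
The driver version can at least be queried from Java through the standard java.sql API. The following is a minimal sketch, assuming Java 7 or later; the class name is hypothetical, and describe() needs an open Connection, which is obtained elsewhere. Whether the reported version says anything about nvarchar support would still have to be checked against the driver vendor's documentation.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class JdbcVersionInfo {
    // Lists the drivers currently registered with DriverManager,
    // with the version numbers each driver reports about itself.
    static List<String> registeredDrivers() {
        List<String> out = new ArrayList<>();
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver d = drivers.nextElement();
            out.add(d.getClass().getName() + " "
                    + d.getMajorVersion() + "." + d.getMinorVersion());
        }
        return out;
    }

    // Driver name and version as reported for an open connection.
    static String describe(Connection con) throws SQLException {
        DatabaseMetaData md = con.getMetaData();
        return md.getDriverName() + " " + md.getDriverVersion();
    }

    public static void main(String[] args) {
        // Prints nothing if no driver is on the classpath.
        for (String line : registeredDrivers()) {
            System.out.println(line);
        }
    }
}
```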

What is strange is that the JRun 3.1 EJBs that were developed by my
predecessors store and retrieve Asian characters from the database
without any problems, whereas I am having lots of problems using JSPs,
JavaBeans + JDBC. Does anyone have any explanation for this?

Many thanks.

Best regards,

Philippe de Rochambeau


