RE: Encodings for SQL Databases

Date: Mon Aug 07 2000 - 11:47:57 EDT

((( Sorry to those who see a mangled subject. It should read "RE: Encodings
for SQL Databases" )))

Jon Peck wrote:
> Most of the major databases now support Unicode at some level, but what
> is the best way to encode SQL statements for various database access
> APIs? [...]

According to the online help of SQL Server 7.0, you have to use the syntax
N'abc' to write a Unicode literal in a SQL statement.

The N prefix echoes the N in NCHAR and NVARCHAR, and parallels the L"abc"
syntax of C (but I wonder, what's that "N" for? One would expect W[ide],
L[ong], or U[nicode]).

I tried the following code in Query Analyzer. The example comes from the
help; I substituted the Danish string with a Chinese one to be sure that
characters >= U+0100 behaved OK.

        DECLARE @nstring nchar(8)
        SET @nstring = N'你好'
        SELECT UNICODE(SUBSTRING(@nstring, 2, 1)),
        NCHAR(UNICODE(SUBSTRING(@nstring, 2, 1)))

The result is:

        ----------- ----
        22909 好

        (1 row(s) affected)

Where 22909 = 0x597D, which is in fact the code of the 2nd character in the
string: "好" (hao3).
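The arithmetic can be cross-checked outside the database. Here is a minimal Python sketch (Python is only an illustration here, not part of the original test) in which ord() and chr() stand in for UNICODE() and NCHAR(). Note that SUBSTRING counts from 1 while Python indexes from 0.

```python
# Cross-check of the Query Analyzer result, done outside SQL Server.
text = "你好"

# T-SQL SUBSTRING(@nstring, 2, 1) is 1-based; the same character is text[1] here.
second = text[1]
code_point = ord(second)      # plays the role of UNICODE(...)
round_trip = chr(code_point)  # plays the role of NCHAR(...)

print(code_point, hex(code_point), round_trip)  # prints: 22909 0x597d 好
```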

The Chinese characters were visible in the Query Analyzer window as soon as
I selected a suitable font.

I then tried saving the script with "Save As...". The choices were "ANSI",
"OEM (cp 437)", and "Unicode". Guess which one I chose, and it saved the
file in the UTF-16 (or is it UCS-2?) format that is accepted by Notepad
(find the file attached).
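On the UTF-16 versus UCS-2 question: for text made only of BMP characters, as in this script, the two forms produce the same bytes, so the file alone cannot tell them apart. The sketch below illustrates this in Python. It assumes that the "Unicode" option means little-endian UTF-16 preceded by a byte order mark; that is my assumption, not something the help states.

```python
import codecs

script = "SET @nstring = N'你好'"

# Assumption: "Unicode" means little-endian UTF-16 preceded by a byte order mark.
data = codecs.BOM_UTF16_LE + script.encode("utf-16-le")

# Every character here is in the BMP, so each takes exactly one 16-bit unit,
# and UCS-2 would produce the same bytes.
assert len(data) == 2 + 2 * len(script)

# A character outside the BMP needs a surrogate pair (two 16-bit units),
# which is where UTF-16 and UCS-2 start to differ.
assert len("\U00010000".encode("utf-16-le")) == 4

# Show the byte order mark and the last three code units of the script.
print(data[:2].hex(), data[-6:].hex())
```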

About APIs, I guess that:

1) The N prefix for string literals should be used as well;

2) The details of the UTF form used are handled by the API.
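As a sketch of point 1, client code that builds a statement as text could wrap each string in the N'...' form itself. The helper name n_literal below is hypothetical and not part of any database API. It doubles embedded single quotes, which is the usual SQL escaping rule. Where an API offers bound parameters, those are probably the safer route.

```python
def n_literal(value: str) -> str:
    """Return value as a T-SQL Unicode literal, e.g. N'abc'.

    Hypothetical helper for illustration; embedded single quotes are doubled.
    """
    return "N'" + value.replace("'", "''") + "'"


# Rebuild the SELECT from the test above with the literal inlined.
statement = "SELECT UNICODE(SUBSTRING(" + n_literal("你好") + ", 2, 1))"
print(statement)
```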

_ Marco

