RE: Encodings for SQL Databases

From: Michael Kung (mkung@microsoft.com)
Date: Mon Aug 07 2000 - 15:56:25 EDT


SQL Server 7.0 and SQL Server 2000 are surrogate safe for
NCHAR/NVARCHAR/NTEXT storage. Until the ISO standard accepts the
surrogate assignment, any statement about surrogate support does not carry
much substance.

Michael

-----Original Message-----
From: Michael (michka) Kaplan [mailto:michka@trigeminal.com]
Sent: Monday, August 07, 2000 9:01 AM
To: Unicode List
Subject: Re: Encodings for SQL Databases

From: <Marco.Cimarosti@icl.com>

> According to the online help of SQL Server 7.0, you have to
> use the syntax N'abc' to write a Unicode literal in a SQL
> statement.
>
> The N prefix echoes the N in NCHAR and NVARCHAR, and
> parallels the L"abc" syntax of C (but I wonder, what's that "N"
> for? One would expect W[ide], L[ong], or U[nicode]).

This stands for "National" and comes from the ANSI-92 specification for SQL
(pardon the political incorrectness!).
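
As a minimal sketch of the N'abc' syntax discussed above, the Python snippet
below builds a national-character literal from a string. The helper name
to_national_literal and the table and column names are hypothetical, invented
for illustration only. It assumes the usual SQL rule that an embedded single
quote is escaped by doubling it.

```python
def to_national_literal(value):
    """Wrap a string as an N'...' literal (hypothetical helper, not a library API).

    Assumes the standard SQL escaping rule: a single quote inside a string
    literal is written as two single quotes.
    """
    return "N'" + value.replace("'", "''") + "'"


# Hypothetical table and column names, used only to show where the literal goes.
statement = (
    "INSERT INTO greetings (msg) VALUES ("
    + to_national_literal("O'Brien")
    + ")"
)
print(statement)
```

When going through an API, passing the string as a bound parameter is usually
preferable to pasting a literal into the statement text.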

> I then tried saving the script with "Save As...". The choices
> were "ANSI", "OEM (cp 437)", and "Unicode". Guess which
> one I chose, and it saved the file in the UTF-16 (or is it UCS-2?)
> format that is accepted by Notepad (find the file attached).

Technically speaking, UCS-2 might be more accurate since SQL 7.0 does not
have surrogate awareness. SQL Server 2000 has some surrogate awareness, and
the sorting of such characters is currently undefined, but I guess you
could claim it to be UTF-16 (although the docs do not do so).
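
To illustrate the UCS-2 versus UTF-16 distinction above, here is a small
Python sketch (not anything SQL Server itself does) that lists the 16-bit
code units of a string. The function names are made up for this example. A
code point outside the Basic Multilingual Plane comes out as two code units in
the surrogate range D800-DFFF; software without surrogate awareness would see
those as two separate 16-bit values.

```python
import struct


def utf16_code_units(text):
    """Return the 16-bit code units of the UTF-16 encoding of text."""
    data = text.encode("utf-16-le")  # little-endian, no byte order mark
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def has_surrogates(text):
    """True if the UTF-16 form of text contains any surrogate code unit."""
    return any(0xD800 <= unit <= 0xDFFF for unit in utf16_code_units(text))


bmp_text = "abc"             # every code point fits in one 16-bit unit
supplementary = "\U00010400"  # one code point beyond U+FFFF

print([hex(u) for u in utf16_code_units(bmp_text)])
print([hex(u) for u in utf16_code_units(supplementary)])
print(has_surrogates(bmp_text), has_surrogates(supplementary))
```

For text that stays within the BMP, the UCS-2 and UTF-16 byte sequences are
identical, which is why the two labels are easy to confuse for a saved file.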

> About API's, I guess that:
> 1) The N prefix for string literals should be used as well;

Yes.

> 2) The details of the UTF form used are handled by the API.

Yes, plus. :-)

michka

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/


