Re: [OT] Unicode-compatible SQL?

From: Jianping Yang
Date: Thu Feb 01 2001 - 19:37:14 EST

That is not quite accurate about Oracle's Unicode support. Since no surrogate
characters have been defined yet, Oracle deliberately limits UTF-8 to a 3-byte
encoding, for performance and semantic reasons. To keep the same binary order
as UTF-16, which is commonly used in NT and Java clients, the Oracle UTF8
character set encodes a surrogate pair as a pair of 3-byte sequences, which
means you can still support UTF-16 through UTF8.
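
To illustrate the binary-order point, here is a minimal Python sketch. It is not
Oracle code; the helper name encode_3byte_pairs is hypothetical, and it assumes
"3-byte pair" means that each UTF-16 surrogate is encoded as its own 3-byte
sequence. Under that assumption a supplementary character sorts before U+FF5E,
as it does in UTF-16, whereas standard 4-byte UTF-8 sorts it after.

```python
def surrogate_pair(cp):
    """Split a supplementary code point (>= U+10000) into UTF-16 surrogates."""
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def encode_3byte_pairs(text):
    """Hypothetical helper: UTF-8, except that a supplementary character is
    written as two 3-byte sequences (one per surrogate), not one 4-byte one."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            for unit in surrogate_pair(cp):
                # "surrogatepass" lets Python emit the 3-byte form of a surrogate.
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


bmp = "\uff5e"       # a BMP character above the surrogate range
supp = "\U00010000"  # the first supplementary character

# Byte-wise comparisons of the three encodings.
utf16_order = supp.encode("utf-16-be") < bmp.encode("utf-16-be")
pair_order = encode_3byte_pairs(supp) < encode_3byte_pairs(bmp)
utf8_order = supp.encode("utf-8") < bmp.encode("utf-8")

print(encode_3byte_pairs(supp).hex(), supp.encode("utf-8").hex())
print(utf16_order, pair_order, utf8_order)
```

The first two comparisons agree with each other and the third does not, which
is the trade-off described above: the 3-byte-pair form matches UTF-16 binary
order at the cost of differing from standard UTF-8 for supplementary characters.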

In the Oracle 9i release, we will address the UTF-8 byte-sizing issue by
providing character semantics, so you can declare a varchar2 column in units of
characters, independent of the database character set. For example, 'col1
varchar2(10 char)' can store 10 UTF-16 code points even when the database
character set is UTF8. In 9i, we will also support another UTF-8 character set,
AL32UTF8, which uses a 4-byte encoding for surrogates and can be used as the
client character set for UTF-8 compliance. In 9i, Oracle's NCHAR will support
UTF-16 in addition to UTF8, so you can choose the encoding that is best for
your storage while keeping the same SQL semantics.
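
As a rough illustration of why sizing in characters rather than bytes matters,
the following Python sketch compares the two ways of counting. It is not how
Oracle implements column limits; the function names are hypothetical, and it
assumes a character limit is counted in UTF-16 code units while a byte limit is
counted in UTF-8 bytes.

```python
def sizes(text):
    """Return the length of text in three different units."""
    return {
        "code_points": len(text),
        "utf16_units": len(text.encode("utf-16-be")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


def fits_byte_limit(text, limit):
    """Hypothetical check for a column sized in UTF-8 bytes."""
    return sizes(text)["utf8_bytes"] <= limit


def fits_char_limit(text, limit):
    """Hypothetical check for a column sized in characters
    (assumed here to mean UTF-16 code units)."""
    return sizes(text)["utf16_units"] <= limit


# Ten Hebrew letters; each one takes 2 bytes in UTF-8.
hebrew = "\u05e9\u05dc\u05d5\u05dd" * 2 + "\u05d0\u05d1"

print(sizes(hebrew))
print(fits_byte_limit(hebrew, 10), fits_char_limit(hebrew, 10))
```

With these assumptions, the ten-letter Hebrew string does not fit a 10-byte
limit but does fit a 10-character limit, which is the field-sizing problem that
character semantics is meant to remove.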


"Carl W. Brown" wrote:

> Tague,
> "datatypes which are always UCS-2"
> You have to be careful with Oracle because you can only use the UCS-2 subset
> of UTF-8. The Oracle CLOB is also stored as UCS-2 not UTF-8. One of the
> problems with UTF-8 is field sizing, as the fields must be sized in UTF-8
> bytes not characters.
> There are advantages and disadvantages of Oracle and SQL Server.
> Carl
> -----Original Message-----
> From: Tague Griffith
> Sent: Monday, January 29, 2001 6:33 PM
> To: Unicode List
> Cc: Unicode List
> Subject: Re: [OT] Unicode-compatible SQL?
> My recommendation for using Unicode with a database would be Oracle.
> Oracle supports Unicode (as UTF-8) quite well and has the language data
> for many locales as part of the universal install. I also prefer that
> it is easier to configure the database character set independently of
> the OS localization and that you aren't limited to using nvarchar,
> nchar, etc. datatypes if you want to use Unicode.
> SQLServer also has Unicode support through the use of nvarchar, nchar,
> etc datatypes which are always UCS-2. SQLServer will probably be a
> cheaper and simpler option to install (although I don't find the Oracle
> install all that complex).
> Unfortunately, MySQL currently doesn't support Unicode, possibly in a
> future version.
> /t
> Elaine Keown wrote:
> >
> > Hello,
> >
> > A friend who is off-list asked me to inquire about Unicode-compatible SQL
> > and database options. He has been told that Microsoft SQL is now available
> > in Unicode, but he usually uses the Macintosh. What other options are there
> > now in the various kinds of databases, probably on the smaller end of the
> > scale?
> > He's interested in the database having the capability of having a Web
> > front-end--I think that's absolutely critical for him. He's doing Hebrew
> > and a tiny bit of Aramaic in his project, so I think he would need at least
> > Unicode 2.0 compatibility, with Unicode 3.0 the best choice.
> >
> > Elaine Keown
> >

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:18 EDT