RE: Java, SQL, Unicode and Databases

From: Michael Kaplan (Trigeminal Inc.) (v-michka@microsoft.com)
Date: Fri Jun 23 2000 - 17:40:59 EDT


The datatype *does* matter in that sense.... you would use UTF-16 data
fields (NTEXT and NCHAR and NVARCHAR) and access it with your favorite data
access method, which will convert as needed to whatever format it uses. You
will never know or care what the underlying engine stores.

The web site stuff will not work for you since you would have to do the
extra conversions to do the data mining, so you would probably go with plan
"A".

My general point is that OLE DB to an Oracle UTF-8 field and to a SQL Server
UTF-16 field all return the same type of data.... UTF-16. So COM in this
case is hiding the differences.
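
To illustrate the point in Java terms (a rough sketch, not how OLE DB or any
real driver is implemented): the class name, the readField helper, and the
sample string below are all hypothetical. The helper stands in for a data
access layer that decodes whatever bytes the engine stores into an in-memory
string, which in Java is always a sequence of UTF-16 code units.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StorageEncodingSketch {

    // Hypothetical stand-in for a data access layer: it decodes the stored
    // bytes, so the caller always receives a String (UTF-16 code units in
    // memory) regardless of the storage encoding.
    public static String readField(byte[] stored, Charset storageEncoding) {
        return new String(stored, storageEncoding);
    }

    public static void main(String[] args) {
        // Sample multilingual text (assumption for the demo).
        String original = "Gr\u00fc\u00dfe \u65e5\u672c\u8a9e";

        // Simulate two engines storing the same text in different encodings.
        byte[] utf8Field = original.getBytes(StandardCharsets.UTF_8);
        byte[] utf16Field = original.getBytes(StandardCharsets.UTF_16LE);

        String fromUtf8 = readField(utf8Field, StandardCharsets.UTF_8);
        String fromUtf16 = readField(utf16Field, StandardCharsets.UTF_16LE);

        // The stored byte sequences differ, but the caller sees equal strings.
        System.out.println("same text: " + fromUtf8.equals(fromUtf16));
        System.out.println("stored sizes differ: "
                + (utf8Field.length != utf16Field.length));
    }
}
```

The UTF-8 path still costs a decoding step inside readField; the caller simply
never sees it, which is the sense in which the access layer hides the storage
format.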

Michael

> ----------
> From: Joe_Ross@tivoli.com[SMTP:Joe_Ross@tivoli.com]
> Sent: Friday, June 23, 2000 2:27 PM
> To: Michael Kaplan (Trigeminal Inc.)
> Cc: Unicode List; Hossein_Kushki%IBMCA@tivoli.com
> Subject: RE: Java, SQL, Unicode and Databases
>
>
>
> Michael, are you saying that the data type (char or nchar) doesn't matter?
> Are you saying that if we just use UTF-16 or wchar_t interfaces to access
> the data all will be fine and we will be able to store multilingual data
> even in fields defined as char? Maybe things aren't as bad as I feared.
>
> With respect to the web applications you describe, do they store the UTF-8
> as binary data? This wouldn't work for us, since we want other data mining
> applications to be able to access the same data.
>
> Thanks,
> Joe
>
> "Michael Kaplan (Trigeminal Inc.)" <v-michka@microsoft.com> on 06/23/2000
> 10:41:39 AM
>
> To: Unicode List <unicode@unicode.org>, Joe Ross/Tivoli Systems@Tivoli
> Systems
> cc: Hossein Kushki@IBMCA
> Subject: RE: Java, SQL, Unicode and Databases
>
>
>
>
> Microsoft is very COM-based for its actual data access methods.... and COM
> uses BSTRs that are BOM-less UTF-16. Because of that, the actual storage
> format of any database ends up irrelevant since it will be converted to
> UTF-16 anyway.
>
> Given that this is what the data layers do, performance is certainly
> better if there does not have to be an extra call to the Windows
> MultiByteToWideChar to convert UTF-8 to UTF-16. So from a Windows
> perspective, not only is it no trouble, but it is also the best possible
> solution!
>
> In any case, I know plenty of web people who *do* encode their strings in
> SQL Server databases as UTF-8 for web applications, since UTF-8 is their
> preference. They are willing to take the hit of "converting themselves"
> because when data is being read it is faster to go through no conversions
> at all.
>
> Michael
>
> > ----------
> > From: Joe_Ross@tivoli.com[SMTP:Joe_Ross@tivoli.com]
> > Sent: Friday, June 23, 2000 7:55 AM
> > To: Unicode List
> > Cc: Unicode List; Hossein_Kushki%IBMCA@tivoli.com
> > Subject: Re: Java, SQL, Unicode and Databases
> >
> >
> >
> > I think that this is also true for DB2 using UTF-8 as the database
> > encoding. From an application perspective, MS SQL Server is the one that
> > gives us the most trouble, because it doesn't support UTF-8 as a
> > database encoding for char, etc.
> > Joe
> >
> > Kenneth Whistler <kenw@sybase.com> on 06/22/2000 06:42:20 PM
> >
> > To: "Unicode List" <unicode@unicode.org>
> > cc: unicode@unicode.org, kenw@sybase.com, mgm@sybase.com (bcc: Joe
> > Ross/Tivoli
> > Systems)
> > Subject: Re: Java, SQL, Unicode and Databases
> >
> >
> >
> >
> > Jianping responded:
> >
> > >
> > > Tex,
> > >
> > > Oracle doesn't have a special requirement for the datatype in the
> > > JDBC driver if you use UTF8 as the database character set. In this
> > > case, all the text datatypes in JDBC will support Unicode data.
> > >
> >
> > The same thing is, of course, true for Sybase databases using UTF-8
> > as the database character set, accessing them through a JDBC driver.
> >
> > But I think Tex's question is aimed at the much murkier area
> > of what the various database vendors' strategies are for dealing
> > with UTF-16 Unicode as a datatype. In that area, the answers for
> > what a cross-platform application vendor needs to do and for how
> > JDBC drivers might abstract differences in database implementations
> > are still unclear.
> >
> > --Ken
> >
> >
> >
>
>
>


