RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

From: Ayers, Mike (Mike_Ayers@bmc.com)
Date: Wed May 30 2001 - 14:43:52 EDT


> From: Peter_Constable@sil.org [mailto:Peter_Constable@sil.org]

> According to the proposal, UTF-8S and UTF-32S would not have the same
> status: they wouldn't be for interchange; they'd just be for
> representation
> internal to a given system, like UTF-EBCDIC (which, I think I
> heard, has
> not actually been implemented by IBM in any live systems).

> <quote>
> UTF-8 database server -> UTF-16 database client
>
> A SQL statement executed on the database server returns a result set
> ordered by the binary sort of the data in UTF-8, given that
> this is the
> encoding of both data and indexes in the database.
>
> A C/C++ or Java UTF-16-based client receives this result set and must
> compare it to a large collection of data stored locally in UTF-16...
> </quote>
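	As an aside, the mismatch in the quoted scenario can be seen directly: UTF-8 binary order follows code point order, but UTF-16 binary order puts supplementary characters (encoded as surrogate pairs, 0xD800-0xDFFF) below the top of the BMP. The sketch below (not part of the original proposal; the example characters are my own choice) demonstrates the divergence:

```python
# Sketch: UTF-8 and UTF-16 binary sort orders disagree for
# supplementary characters -- the mismatch UTF-8S is meant to hide.

bmp = "\uFF61"        # U+FF61, a BMP character near the top of the BMP
supp = "\U00010000"   # U+10000, the first supplementary character

# UTF-8 binary order follows code point order: U+FF61 < U+10000
utf8_order = sorted([bmp, supp], key=lambda s: s.encode("utf-8"))

# UTF-16 binary order: the surrogate pair D800 DC00 sorts below FF61,
# so U+10000 < U+FF61
utf16_order = sorted([bmp, supp], key=lambda s: s.encode("utf-16-be"))

print(utf8_order == [bmp, supp])    # True: code point order
print(utf16_order == [supp, bmp])   # True: surrogates sort first
```

	So a UTF-16 client doing binary comparisons cannot merge a UTF-8-binary-sorted result set without re-sorting; UTF-8S would make the server's binary order match the client's.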

        Okay, here's my train of thought, with hopes of enlightenment. SQL
is a standard. SQL interfaces are standard. That which goes over SQL
interfaces should therefore be standardized. SQL interfaces are external,
i.e. I can connect my SQL client to an arbitrary SQL server, and, provided I
have the credentials, I can query away. Oracle (et al.) want to export
UTF-8S and UTF-32S on a SQL interface. If all the preceding is correct,
then the claim that these encodings would just be for internal
representation is false.

        Better yet, here's a challenge for the proponents:

        Both a UTF-8 based client and a UTF-16 based client are connected to
the same database. Each runs a large query. Show that UTF-8S or UTF-32S
improves the performance of either system without the encoding being used
externally to it.

        Any and all enlightenment appreciated.

/|/|ike



This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:18:17 EDT