interoperability of sorted data

From: Tex Texin (texin@progress.com)
Date: Mon Nov 15 1999 - 16:32:19 EST


I would like to compile information about the sorting algorithms
used by different products. Many of us work with applications built
from heterogeneous components, and if those components differ in
sorting behavior, the results can be spurious.

I know about the linguistic differences based on locale. I want
to focus on product differences here. Some products use one-pass,
some two-pass, some four-pass. Some products allow customers to
change the sort weights of characters and others don't. Some allow
weights up to 255, some higher.
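
To make the one-pass versus two-pass distinction concrete, here is a
minimal Python sketch. It is only an illustration: the weight table
PRIMARY and both key functions are hypothetical and do not model any
particular product. The one-pass key uses a single weight per character
(its code point); the two-pass key compares base letters first and uses
case only to break ties.

```python
# Hypothetical weight table for illustration; no real product is modeled.
PRIMARY = {"a": 1, "b": 2, "c": 3}  # base-letter weights, case ignored


def one_pass_key(word):
    # Single level: each character's code point is its only weight.
    return [ord(ch) for ch in word]


def two_pass_key(word):
    # Level 1: base letters, ignoring case.
    primary = [PRIMARY[ch.lower()] for ch in word]
    # Level 2: case, consulted only when level 1 is a tie.
    secondary = [1 if ch.isupper() else 0 for ch in word]
    return (primary, secondary)


words = ["b", "B", "a", "Ab", "ab", "C"]
one_pass = sorted(words, key=one_pass_key)
two_pass = sorted(words, key=two_pass_key)

print(one_pass)  # ['Ab', 'B', 'C', 'a', 'ab', 'b']
print(two_pass)  # ['a', 'ab', 'Ab', 'b', 'B', 'C']
```

The same six strings come out in two different orders, which is the
kind of product difference I am asking about.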

However, where products use different algorithms for different
locales, I will need to know that.

An example of the problem that having different sort orders causes
is as follows. (It is not restricted to client-server. It can occur
with any data passed between applications.)

In a client-server environment, if the client uses one approach and
the server another, queries can return incorrect results. As a
hypothetical example, suppose the letter P sorts after the letter Z on
the client (i.e. A-Z, then P), but after the letter O on the server
(i.e. A-O, then P, then Q-Z). A query requesting all the records
between A and Z will cause the server to send all the records,
including those beginning with P, to the client. The client will throw
the P records away, and may discard the records after the P as well
(Q-Z) if it assumes the records arrived sorted in its own order.
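
The hypothetical P example above can be simulated with a short Python
sketch. Everything in it is an assumption made for illustration: the
two orderings, the sample records, and the function names
(server_query, client_filter, client_scan) are invented, and the range
test looks only at the first letter of each record.

```python
import string

SERVER_ORDER = string.ascii_uppercase               # A-O, then P, then Q-Z
CLIENT_ORDER = SERVER_ORDER.replace("P", "") + "P"  # A-Z, then P


def make_key(order):
    # Build a sort key from an ordering of the uppercase letters.
    rank = {ch: i for i, ch in enumerate(order)}
    return lambda s: [rank[ch] for ch in s]


server_key = make_key(SERVER_ORDER)
client_key = make_key(CLIENT_ORDER)

records = ["ADA", "OTTO", "PAT", "QUINN", "ZED"]


def server_query(low, high):
    # Server selects and sorts using its own ordering.
    hits = [r for r in records
            if server_key(low) <= server_key(r[0]) <= server_key(high)]
    return sorted(hits, key=server_key)


def client_filter(rows, low, high):
    # Client re-checks every row against its own ordering.
    return [r for r in rows
            if client_key(low) <= client_key(r[0]) <= client_key(high)]


def client_scan(rows, low, high):
    # Client trusts the rows are in its own order and stops at the
    # first row that falls past the upper bound.
    kept = []
    for r in rows:
        if client_key(r[0]) > client_key(high):
            break
        kept.append(r)
    return kept


sent = server_query("A", "Z")
filtered = client_filter(sent, "A", "Z")
scanned = client_scan(sent, "A", "Z")

print(sent)      # ['ADA', 'OTTO', 'PAT', 'QUINN', 'ZED']
print(filtered)  # ['ADA', 'OTTO', 'QUINN', 'ZED']
print(scanned)   # ['ADA', 'OTTO']
```

The filtering client loses only the P record; the client that trusts
the incoming order also loses Q-Z.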

So, this is a long way of asking for pointers to where each of the
major products documents its sort algorithms. I am interested in
databases (such as SQL Server and Oracle), applications (Excel,
Windows, etc.), and technologies (Java).

After I compile the information I will send it back to the list.

I did a little research already, and the documentation I found was
not very specific about the algorithms and offered hand-waving
arguments about dictionary and other kinds of sorts. So I thought I
could resort to the Unicode list, and hopefully some of you can tell
me what doesn't seem to be documented.

Tex

-- 
Progress Software: The #1 Embedded Database 
-------------------------------------------------------------------------------------------------------
Tex Texin                      Director, International Products
                                 
Progress Software Corp.        Voice:         +1-781-280-4271
14 Oak Park                      Fax:         +1-781-280-4949
Bedford, MA 01730  USA             texin@bedford.progress.com

http://www.progress.com http://apptivity.progress.com
-------------------------------------------------------------------------------------------------------


