Case insensitive comparisons

From: david@oz.com
Date: Tue Mar 28 2000 - 11:33:38 EST


I was stepping through some code that does case-insensitive comparison of
Unicode strings, and noticed that the heart of the function converts each
character to lowercase before comparing. If the lowercased characters
match, the originals are considered equal; otherwise not.
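
To make the approach concrete, here is a minimal sketch of that kind of
comparison in Python. It is only a stand-in for the code I was reading:
the name equal_ignoring_case is made up for illustration, and Python's
str.lower() may not use the same case-mapping tables as the original
function.

```python
def equal_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings by lowercasing each character in turn."""
    # Strings of different length can never match character by character.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        # str.lower() is locale-independent; it knows nothing about
        # Turkish or any other locale-specific casing rules.
        if x.lower() != y.lower():
            return False
    return True
```

Because the mapping is applied without regard to locale, the same letter
is always folded the same way no matter where the data came from.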

I'm wondering about this because I am dealing with data from mixed
locales. For the Turkish dotted capital I (İ), for example, this means
that HİS will match HIS, which must be considered a false positive if HIS
comes from a non-Turkish environment.

A false positive is probably OK for my purposes, but a false negative is
not. So are there any cases where this approach would return a false
negative? I would think it would happen if there are any two distinct
Unicode letters that map to the same uppercase value but to different
lowercase values.
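
One way to look for such letters mechanically might be the sketch below.
It assumes Python's built-in case mappings are a fair proxy for the
Unicode data; the function name upper_collisions and its limit parameter
are hypothetical and exist only for this illustration.

```python
import sys
from collections import defaultdict


def upper_collisions(limit: int = sys.maxunicode + 1) -> dict:
    """Map each uppercase form to the set of distinct lowercase forms
    of the letters that produce it, keeping only ambiguous entries."""
    groups = defaultdict(set)
    for cp in range(limit):
        ch = chr(cp)
        if ch.isalpha():
            # Letters sharing an uppercase form land in the same group.
            groups[ch.upper()].add(ch.lower())
    # More than one lowercase form means that a lowercase-only
    # comparison would treat those letters as different.
    return {up: lows for up, lows in groups.items() if len(lows) > 1}


if __name__ == "__main__":
    for up, lows in sorted(upper_collisions().items()):
        print(repr(up), sorted(lows))
```

Any entry this prints would be a pair of letters that compare equal when
uppercased but unequal when lowercased, which is the false negative I am
worried about.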

Are there any such cases, or am I totally off base?

With thanks in advance for your help,
David H. Brandt
OZ.COM


