RE: What is this "case folding"?

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Jul 11 2000 - 16:36:32 EDT


At 02:43 AM 7/11/00 -0800, Marco.Cimarosti@icl.com wrote:
>To achieve case insensitive (or however "loose") comparison, an alternative
>to hidden case folding is using collation tables, that assign one or more
>levels of "weight" keys to each character. One good example of this is
>UTR#10 (http://www.unicode.org/unicode/reports/tr10/).

TR#21 Case Mappings provides a link to a Case Folding Table.
http://www.unicode.org/unicode/reports/tr21

This table is different from the 1:1 case relations that are defined in
UnicodeData.txt on ftp://ftp.unicode.org/Public/UNIDATA and also different
from the full-fledged, occasionally locale-dependent case mappings in
SpecialCasing.txt at the same location.
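
To make the distinction concrete, here is a minimal sketch in Python. It
assumes that Python's built-in str.casefold() is an acceptable stand-in for
a lookup in the Case Folding Table; it does not read the TR#21 table itself.
The helper name loose_equal is hypothetical.

```python
def loose_equal(a: str, b: str) -> bool:
    """Compare two strings after case folding (hypothetical helper).

    Assumption: str.casefold() stands in for a lookup in the
    Case Folding Table; the table itself is not read here.
    """
    return a.casefold() == b.casefold()


# Lowercasing keeps the German sharp s as a single character ...
lowered = "Straße".lower()

# ... while case folding expands it, so it can match "STRASSE".
folded = "Straße".casefold()

print(lowered, folded, loose_equal("Straße", "STRASSE"))
```

Folding both sides before comparing is the point: the folded form is only
meant for matching, not for display.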

There are other 'loose matches' that are sometimes useful in searching and
that are not covered by case folding. In Japanese, folding the two styles
of Kana (Katakana and Hiragana) can sometimes be useful. Also, folding the
width of half-width and full-width characters may be desired.
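
A rough sketch of these two foldings, applied selectively, might look like
the following Python. The function names fold_kana and fold_width are
hypothetical. The Kana fold assumes the fixed 0x60 offset between the
Katakana and Hiragana blocks, and the width fold relies on the <wide> and
<narrow> compatibility tags in the character database as exposed by
Python's unicodedata module.

```python
import unicodedata

# Assumption: Katakana U+30A1..U+30F6 sit exactly 0x60 above their
# Hiragana counterparts U+3041..U+3096.
KATAKANA_FIRST = 0x30A1
KATAKANA_LAST = 0x30F6
KANA_OFFSET = 0x60


def fold_kana(text: str) -> str:
    """Map Katakana letters to Hiragana; leave everything else alone."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


def fold_width(text: str) -> str:
    """Replace only characters whose decomposition is <wide> or <narrow>."""
    out = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        if parts and parts[0] in ("<wide>", "<narrow>"):
            # The remaining fields are hex code points of the mapping.
            out.append("".join(chr(int(p, 16)) for p in parts[1:]))
        else:
            out.append(ch)
    return "".join(out)


print(fold_kana("カタカナ"))
print(fold_width("ＡＢＣ１２３"))
print(fold_kana(fold_width("ｶﾅ")))
```

Keeping the two folds as separate functions lets a search pick one without
the other. Half-width voiced sound marks come out as combining marks here,
so a canonical composition pass afterwards may be wanted.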

Some, but not all, of these foldings are lumped together in Normalization
Form KC; see http://www.unicode.org/unicode/reports/tr15
However, that normalization form is all-or-nothing, and it includes a number
of less desirable foldings as well.
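
A small Python sketch shows the all-or-nothing behavior, using the standard
unicodedata.normalize() call. Which foldings count as "less desirable"
depends on the application; the superscript and ligature cases below are
only my guess at typical examples. The wrapper name nfkc is hypothetical.

```python
import unicodedata


def nfkc(text: str) -> str:
    """Apply Normalization Form KC (hypothetical convenience wrapper)."""
    return unicodedata.normalize("NFKC", text)


# Width folding is included ...
print(nfkc("ＡＢＣ"), nfkc("ｶﾅ"))

# ... but Katakana is not folded to Hiragana, and case is untouched.
print(nfkc("カナ"), nfkc("ABC"))

# Foldings that come along whether wanted or not (assumed examples):
# a superscript digit becomes a plain digit, a ligature is split.
print(nfkc("x²"), nfkc("ﬁ"))
```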

A./


