Re: Case folding

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Fri Jun 09 2006 - 10:04:19 CDT

Next message: Richard Wordingham: "Re: UTF-8 can be used for more than it is given credit"

Previous message: Philippe Verdy: "Re: Glyphs for German quotation marks"
In reply to: Philippe Verdy: "Re: Case folding"
Next in thread: Mike: "Re: Case folding"

Philippe Verdy wrote on Friday, June 09, 2006 at 7:34 AM

> From: "Mike" <mike-list@pobox.com>
>>> To answer this question, ask yourself what would happen if you
>>> uppercased the string "Straße" this way.

>> I think I would get the right answer, "STRASSE" (if
>> that is the "sharp S" I have learned about a few weeks
>> back).

> Wrong. The case folding of the sharp s is a sharp s. The standard case
> folding does not convert any letter to uppercase.

Is there a 'standard' case folding? There are two default case foldings:
the simple case folding and the full case folding. The simple case folding
of sharp s is as you state, but the full case folding is 'ss'. This follows
from the full uppercasing being 'SS', so Mike's answer is correct. Or are you
saying that 'ﬀrench' should not case-fold to the same string as 'Ffrench'?
(Incidentally, how should we handle the locale specific titlecasing here?
It's a bit more local than simply 'en'!)
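
As a rough illustration of the full case folding described above, here is a
minimal Python sketch. It assumes Python 3, whose built-in str.casefold()
applies full case folding and whose str.upper() applies full uppercasing.
Python has no built-in simple case folding as far as I know, so str.lower()
appears only to show a mapping that, like the simple folding, leaves the
sharp s unchanged. The variable names are made up for the example.

```python
# Full case folding via Python's built-in str.casefold().
sharp_s = "Stra\u00dfe"      # "Straße", with U+00DF LATIN SMALL LETTER SHARP S
ligature = "\ufb00rench"     # "ﬀrench", starting with U+FB00 LATIN SMALL LIGATURE FF

# Full case folding expands the sharp s to "ss".
full_fold_sharp_s = sharp_s.casefold()

# Full uppercasing expands the sharp s to "SS".
full_upper_sharp_s = sharp_s.upper()

# Under full case folding the ligature form and the plain form compare equal.
same_fold = ligature.casefold() == "Ffrench".casefold()

# Lowercasing alone leaves the sharp s as it is (not a real simple case
# folding, just a mapping that agrees with it for this character).
lower_only = sharp_s.lower()
```

Comparing the casefold() results is the usual way to get a caseless match in
Python; comparing lower() results would treat 'Straße' and 'STRASSE' as
different.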

> Note that if you compare case-insensitively and don't care about other
> variations (at secondary collation level or higher), you can greatly reduce
> the complexity of the algorithm and get a much faster result using the
> following:
>
> toLowerCase( toUpperCase(filter(NFKD( string ))) )
>
> where the filter() function eliminates all combining characters with
> combining class greater than zero.

It's a shame it filters out all the Tibetan vowels.
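
To make the objection concrete, here is a minimal Python sketch of the quoted
formula, assuming Python 3 and the standard unicodedata module. The function
name loose_fold is hypothetical, and the filter step is my reading of "remove
every character whose canonical combining class is greater than zero".

```python
import unicodedata


def loose_fold(text: str) -> str:
    """Sketch of toLowerCase(toUpperCase(filter(NFKD(text))))."""
    decomposed = unicodedata.normalize("NFKD", text)
    # filter(): drop every character with a non-zero canonical combining class.
    filtered = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    return filtered.upper().lower()


# U+0F56 TIBETAN LETTER BA, U+0F7C TIBETAN VOWEL SIGN O, U+0F51 TIBETAN LETTER DA
tibetan = "\u0f56\u0f7c\u0f51"
tibetan_folded = loose_fold(tibetan)

# Latin input behaves as the formula intends.
strasse_folded = loose_fold("Stra\u00dfe")   # "Straße"
ligature_folded = loose_fold("\ufb00rench")  # "ﬀrench"
```

The Tibetan vowel sign has a non-zero combining class, so the filter removes
it and only the two consonant letters remain; two words differing only in
their vowels would then compare equal.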

Richard.

This archive was generated by hypermail 2.1.5 : Fri Jun 09 2006 - 10:13:01 CDT