Re: Compliant Tailoring of Normalisation for the Unicode Collation Algorithm from Richard Wordingham on 2012-05-18 (Unicode Mail List Archive)

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Fri, 18 May 2012 21:58:51 +0100

On Fri, 18 May 2012 09:51:34 -0700
Markus Scherer <markus.icu_at_gmail.com> wrote:

> There is nothing that requires us to get correct results *without
> normalization* for all FCD strings or any other particular input
> conditions (except NFD input).

So long as you don't claim conformance to the CLDR collation
definitions. If you do, a lot depends on how one interprets the
definition of normalisation settings given in UTS#35 'Unicode Locale
Data Markup Language' Revision 25 (Version 21.0.1) Section 5.1.4.3:

"If on, then the normal [UCA] algorithm is used. If off, then all
strings that are in [FCD] will sort correctly, but others will not
necessarily sort correctly. So should only be set off if the
strings to be compared are in FCD."

This is stronger than the corresponding description in the UCA. I
assume that the 'will' is there because it is what a user is allowed to
expect - so long as he ignores the dictum that 'all software has bugs'.
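
To make the FCD condition in that quotation concrete, here is a minimal
Python sketch. It is not taken from ICU or CLDR, and the function name
is_fcd is hypothetical. It assumes the usual definition of FCD:
decomposing each character on its own, with no reordering of combining
marks, already yields the NFD form of the whole string.

```python
import unicodedata


def is_fcd(s: str) -> bool:
    """Return True if s is in FCD form (hypothetical helper, not an ICU API).

    Assumption: a string is FCD when concatenating the full canonical
    decomposition of each character, without canonical reordering,
    gives the same result as NFD of the whole string.
    """
    raw = "".join(unicodedata.normalize("NFD", ch) for ch in s)
    return raw == unicodedata.normalize("NFD", s)


# Precomposed a-acute: decomposes to a + U+0301, already in canonical order.
print(is_fcd("\u00e1"))          # True
# a + dot below (ccc 220) + acute (ccc 230): canonical order.
print(is_fcd("a\u0323\u0301"))   # True
# a + acute (ccc 230) + dot below (ccc 220): marks out of order.
print(is_fcd("a\u0301\u0323"))   # False
# Precomposed a-acute followed by dot below: raw decomposition is unordered.
print(is_fcd("\u00e1\u0323"))    # False
```

On my reading of the quoted text, only the first two strings are ones
that a conformant implementation must sort correctly with normalisation
off.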

Richard.
Received on Fri May 18 2012 - 16:04:18 CDT
