Re: More Permanent Faults? - Unicode 5.0 Casefolding

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Sat Jun 10 2006 - 05:57:57 CDT

Next message: Richard Wordingham: "Tentative Definition of Casefolding"

Previous message: Michael Everson: "RE: Glyphs for German quotation marks"
In reply to: Mark Davis: "Re: More Permanent Faults? - Unicode 5.0 Casefolding"
Next in thread: George W Gerrity: "Re: More Permanent Faults? - Unicode 5.0 Casefolding"

Mark Davis wrote on Saturday, June 10, 2006 at 1:10 AM

> C9 basically says that you should respect canonical equivalence, and you
> should be prepared for any other process to respect it. In the standard we
> supply case folding operations that do not, in themselves, require
> normalization, but in edge cases may not respect canonical equivalence.
> While we strongly encourage that all processing respect canonical
> equivalence, but recognize that for some common tasks like case folding,
> people may not want to take on the extra performance / code-complicating
> of adding normalization, to handle a small number of edge cases. But we also
> define forms of case folding that do, in fact, respect canonical
> equivalence.

The problem then comes with conformance requirement C20:

C20 An implementation that purports to support the default casing operations
of case conversion, case detection, and caseless mapping shall do so in
accordance with the definitions and specifications in Section 3.13, Default
Case Operations.

It seems then that the default uppercasings of <U+1FB3, U+0342> and
<U+1FB3, U+0304> are <U+1FBC, U+0342> and <U+1FBC, U+0304> and their default
casefoldings are <U+03B1, U+03B9, U+0342> and <U+03B1, U+03B9, U+0304>. Is
this the case? If so, does C20 override C9? May correct processes offering
'default full uppercasing (or casefolding) as defined by Unicode Version
x.y' produce canonically inequivalent outputs?
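
To make the concern concrete, here is a minimal sketch in Python. It assumes that Python's built-in str.casefold() applies the full default case folding of whatever Unicode version ships with the interpreter (not necessarily 5.0), and the helper names canonically_equivalent and canonical_casefold are hypothetical, introduced only for this illustration. It compares each sequence above with a canonically equivalent spelling, folds both, and checks whether the results are still canonically equivalent.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def canonical_casefold(s: str) -> str:
    """Hypothetical helper: normalize to NFD before and after folding,
    so that canonically equivalent inputs give identical outputs."""
    nfd = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFD", nfd.casefold())


# Each pair holds two canonically equivalent spellings of the same text.
pairs = [
    ("\u1fb3\u0342", "\u1fb7"),        # <U+1FB3, U+0342> vs U+1FB7
    ("\u1fb3\u0304", "\u1fb1\u0345"),  # <U+1FB3, U+0304> vs <U+1FB1, U+0345>
]

for a, b in pairs:
    plain_a, plain_b = a.casefold(), b.casefold()
    print(
        [f"U+{ord(c):04X}" for c in plain_a],
        [f"U+{ord(c):04X}" for c in plain_b],
        "inputs equivalent:", canonically_equivalent(a, b),
        "plain folds equivalent:", canonically_equivalent(plain_a, plain_b),
        "normalized folds equal:", canonical_casefold(a) == canonical_casefold(b),
    )
```

Under those assumptions, the plain folding moves the iota ahead of the combining mark in one spelling and behind it in the other, which is the canonically inequivalent output asked about above; folding the NFD form instead avoids this, at the cost of the extra normalization passes Mark mentions.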

The issues are entirely restricted to trying to implement the default casing
functions. Producing tailored casing functions is a different issue.

The urgency arises from the imminent partial freezing of default
casefolding.

If the Unicode handling of Greek is to be improved, it may well require
locale-sensitive rules. It may be as well to declare locales as inherently
unstable - if Unicode lasts for centuries, they will be.

Richard.


This archive was generated by hypermail 2.1.5 : Sat Jun 10 2006 - 06:06:00 CDT