Re: CaseFirst and CaseLevel Tailorings of UCA and LDML

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Tue, 22 May 2012 09:09:55 +0100

On Mon, 21 May 2012 17:07:33 -0700
Markus Scherer <markus.icu_at_gmail.com> wrote:

> In principle, it's straightforward: Lowercase and uppercase follow
> Unicode (UCD) case properties. We distinguish an intermediate "mixed
> case" for titlecase characters and mixed-case contractions. I believe
> we also distinguish small/normal Kana as lowercase/uppercase. I can
> dig up the ICU code that computes the collation case bits for a
> string.
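
[A rough sketch of the three-way classification described in the quoted
text, not ICU's actual code. The function name case_class is
hypothetical; the general categories Lu/Ll/Lt only approximate the UCD
Uppercase/Lowercase properties; treating uncased text as "lower" is an
assumption; and the small/normal Kana distinction is not modelled.]

```python
import unicodedata


def case_class(s: str) -> str:
    """Classify a character or contraction as 'lower', 'mixed' or 'upper'."""
    # Collect the general categories of all code points in the string.
    cats = {unicodedata.category(ch) for ch in s}
    has_upper = "Lu" in cats
    has_lower = "Ll" in cats
    # Titlecase characters, and contractions mixing upper and lower case,
    # fall into the intermediate "mixed" class.
    if "Lt" in cats or (has_upper and has_lower):
        return "mixed"
    if has_upper:
        return "upper"
    # Assumption: anything without uppercase letters counts as lowercase.
    return "lower"
```

With this sketch, a contraction such as "ch" would be classed as lower,
"Ch" as mixed and "CH" as upper, while a titlecase character such as
U+01C5 would be mixed on its own.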

Is this code in ICU 4.4.2 (the version packaged for the Linux
distribution I run), or should I be looking at ICU 49?

> I don't know whether CLDR/LDML should require all of the details, but
> there should at least be informative documentation.

If they are to define collation, they have to define how the order
results from the tailoring. Of course, it can be done by reference,
but while saying 'as in UCA' is entirely appropriate where the UCA is
adequately defined (some tailorings clearly are not, and work is under
way to fix some of these shortfalls), I am uneasy at 'as in ICU'.

Richard.
Received on Tue May 22 2012 - 03:11:49 CDT
