Re: Default case algorithms from Philippe Verdy on 2014-06-24 (Unicode Mail List Archive)

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Tue, 24 Jun 2014 18:03:48 +0200

2014-06-24 17:07 GMT+02:00 Markus Scherer <markus.icu_at_gmail.com>:

> On Tue, Jun 24, 2014 at 4:56 PM, Daniel Bünzli <
> daniel.buenzli_at_erratique.ch> wrote:
>
>> Does an algorithm that simply applies R1 *regardless of context*
>> constitute a default case algorithm or not ? I.e. does simply mapping each
>> character C in a string using Uppercase_Mapping (C) (e.g. as exposed by the
>> XML UCD) constitute a default case conversion as mandated by the standard ?
>>
>
> It implements simple uppercasing but not full uppercasing.
> It misses simple, common things like ß -> SS (which is neither
> language-dependent nor context-sensitive).
>

Not so simple; maybe it is SS for modern German, but Czech would map it to
SZ, and historically that letter is a ligature of SZ (including in old
German texts where that ligature was used), along with many other ligatures
in medieval texts.
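
To make the distinction Markus draws concrete, here is a rough Python sketch
contrasting a per-character one-to-one mapping with full uppercasing. The
names simple_upper and full_upper are hypothetical helpers, not part of any
library. Python does not expose the Simple_Uppercase_Mapping property
directly, so simple_upper only approximates it: it keeps a character
unchanged whenever str.upper() would expand it to several characters. That
matches the UCD for ß, but a few characters have a one-to-one simple mapping
that differs from their multi-character full mapping, and this sketch does
not handle those.

```python
def simple_upper(s: str) -> str:
    """Approximate simple (one-to-one) uppercasing, character by character.

    Assumption: a character whose full uppercase mapping is longer than one
    character is left unchanged. This is only an approximation of the
    Simple_Uppercase_Mapping property in the UCD.
    """
    out = []
    for c in s:
        u = c.upper()
        out.append(u if len(u) == 1 else c)
    return "".join(out)


def full_upper(s: str) -> str:
    """Full uppercasing as done by str.upper(), which may lengthen the string."""
    return s.upper()


word = "straße"
print(simple_upper(word))  # ß is left as is
print(full_upper(word))    # ß expands to SS
```

Neither function is language-sensitive; whether SS or SZ is the right
expansion for a given text is exactly the tailoring question raised here.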

If texts were printed in Fraktur style, there is always an ambiguity about
whether you should even use ß as a single letter or should instead encode
separate letters, without even needing to encode any ligature hint:
ligatures are everywhere in the text in its original form and are inherent
to the script style. You would use hints only for variants of these
ligatures or for infrequent absences of a ligature.

_______________________________________________
Unicode mailing list
Unicode_at_unicode.org
http://unicode.org/mailman/listinfo/unicode
Received on Tue Jun 24 2014 - 11:05:24 CDT
