Re: Compatibility decomposition for Hebrew and Greek final letters from Eli Zaretskii on 2015-02-19 (Unicode Mail List Archive)

From: Eli Zaretskii <eliz_at_gnu.org>
Date: Thu, 19 Feb 2015 22:17:30 +0200

> From: Philippe Verdy <verdy_p_at_wanadoo.fr>
> Date: Thu, 19 Feb 2015 20:31:07 +0100
> Cc: Julian Bradfield <jcb+unicode_at_inf.ed.ac.uk>,
> unicode Unicode Discussion <unicode_at_unicode.org>
>
> The decompositions are not needed for plain-text searches, which can use the
> collation data. With the collation data, you can unify at the primary level
> differences such as capitalisation and ignore diacritics, or transform some
> base groups of letters into a single entry, or make some significant primary
> difference when there are diacritics (for example in German equating 'ae' and
> 'ä' at the primary level).

Sorry, I disagree. First, collation data is overkill for search,
since the order information is not required, so the weights are simply
wasting storage. Second, people do want to find, e.g., "²" when they
search for "2" etc. I'm not saying that they _always_ want that, but
sometimes they do. There's no reason a sophisticated text editor
shouldn't support such a feature, under user control.
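
To illustrate the kind of user-controlled search meant above, here is a
minimal Python sketch. It assumes the standard unicodedata module and
uses NFKD normalization as a stand-in for the compatibility
decompositions; the names fold() and find() and the compat switch are
hypothetical, chosen only for this example, and a real editor would
also have to map match positions back to the original text.

```python
import unicodedata

def fold(text, compat=True):
    # NFKD applies compatibility decompositions (e.g. "²" -> "2");
    # NFD applies only canonical ones, leaving "²" untouched.
    form = "NFKD" if compat else "NFD"
    return unicodedata.normalize(form, text)

def find(needle, haystack, compat=True):
    # Substring test after folding both strings the same way.
    # No collation weights are involved, only decomposition data.
    return fold(needle, compat) in fold(haystack, compat)

print(find("2", "E = mc²"))                # True: "²" folds to "2"
print(find("2", "E = mc²", compat=False))  # False: feature switched off
```

With compat=True the search for "2" also hits "²"; with compat=False it
does not, which is the "under user control" part.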
Received on Thu Feb 19 2015 - 14:18:49 CST
