Re: Compatibility decomposition for Hebrew and Greek final letters

From: Eli Zaretskii <eliz_at_gnu.org>
Date: Fri, 20 Feb 2015 10:13:41 +0200

> From: Philippe Verdy <verdy_p_at_wanadoo.fr>
> Date: Fri, 20 Feb 2015 04:47:52 +0100
> Cc: jcb+unicode_at_inf.ed.ac.uk, unicode Unicode Discussion <unicode_at_unicode.org>
>
> > Sorry, I disagree. First, collation data is overkill for search,
> > since the order information is not required, so the weights are simply
> > wasting storage. Second, people do want to find, e.g., "²" when they
> > search for "2" etc. I'm not saying that they _always_ want that, but
> > sometimes they do. There's no reason a sophisticated text editor
> > shouldn't support such a feature, under user control.
>
> The weights or the collation strings do not need to be stored. Even database
> engines or plain-text search engines on the web now provide collation
> algorithms for searching or sorting data, so that you don't need to store it in
> your tables... It is not overkill, as good implementations of collation are
> effectively used in high-performance database servers (and many users of these
> databases do not realize that collation is effectively used).

I'm talking specifically about Emacs. Emacs provides locale-dependent
collation, but it relies on the underlying platform libraries to do
the work; it doesn't itself load the DUCET database, or anything
similar to it. By contrast, Emacs does have an efficient-storage
implementation of the UCD, and by virtue of that, accessing
decomposition data and performing normalization is at my fingertips.
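
To illustrate what the decomposition data buys for search, here is a
minimal sketch in Python rather than Emacs Lisp. It is not how Emacs
does it; it only assumes that the UCD decomposition mappings are
available, here through Python's standard unicodedata module. The
function names fold_for_search and find_folded are made up for this
example.

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Return a search key built from the compatibility decomposition."""
    # NFKD replaces compatibility characters with their decompositions,
    # e.g. SUPERSCRIPT TWO (U+00B2) becomes DIGIT TWO (U+0032).
    return unicodedata.normalize("NFKD", text)


def find_folded(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack after folding both."""
    return fold_for_search(needle) in fold_for_search(haystack)
```

With this, find_folded("2", "m²") succeeds, while a plain substring
test on the unfolded strings would not. No collation weights are
involved at any point.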

So I'd like to avoid loading DUCET, and doing so just for the sake of
a few characters mentioned in this thread doesn't sound justified;
it's much easier to have a small database of additional equivalences.
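
A small database of additional equivalences could look roughly like
the sketch below, again in Python for illustration only. The table
name EXTRA_EQUIVALENCES and the function fold_with_extras are
hypothetical, and the table covers only the Hebrew final letters and
the Greek final sigma from the subject of this thread, none of which
have a compatibility decomposition in the UCD.

```python
import unicodedata

# Hypothetical hand-maintained table of equivalences that the UCD
# decomposition data does not provide: final forms map to base letters.
EXTRA_EQUIVALENCES = str.maketrans({
    "\u05DA": "\u05DB",  # HEBREW LETTER FINAL KAF   -> KAF
    "\u05DD": "\u05DE",  # HEBREW LETTER FINAL MEM   -> MEM
    "\u05DF": "\u05E0",  # HEBREW LETTER FINAL NUN   -> NUN
    "\u05E3": "\u05E4",  # HEBREW LETTER FINAL PE    -> PE
    "\u05E5": "\u05E6",  # HEBREW LETTER FINAL TSADI -> TSADI
    "\u03C2": "\u03C3",  # GREEK SMALL LETTER FINAL SIGMA -> SIGMA
})


def fold_with_extras(text: str) -> str:
    """Apply compatibility decomposition, then the extra equivalences."""
    decomposed = unicodedata.normalize("NFKD", text)
    # translate() substitutes each character that has a table entry
    # and leaves every other character untouched.
    return decomposed.translate(EXTRA_EQUIVALENCES)
```

The table is applied after normalization, so the two sources of
equivalence stay independent and the extra entries can be switched
on or off under user control.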

> There are also good text editors implementing collation searches.

Could you mention their names, please?

Thanks.
_______________________________________________
Unicode mailing list
Unicode_at_unicode.org
http://unicode.org/mailman/listinfo/unicode
Received on Fri Feb 20 2015 - 02:14:22 CST
