Re: Compatibility decomposition for Hebrew and Greek final letters

From: Eli Zaretskii <>
Date: Fri, 20 Feb 2015 10:04:32 +0200

> Date: Thu, 19 Feb 2015 22:02:57 +0000
> From: Richard Wordingham <>
> > First, collation data is overkill for search,
> > since the order information is not required, so the weights are simply
> > wasting storage.
> The big waste is not in text-dependent storage, but in the
> processing for search orders that bear little relationship to
> alphabetical order.

Sorry, I don't think I follow: what is "processing for search orders"
to which you allude here?

> > Second, people do want to find, e.g., "" when they
> > search for "2" etc. I'm not saying that they _always_ want that, but
> > sometimes they do. There's no reason a sophisticated text editor
> > shouldn't support such a feature, under user control.
> I think one problem is disbelief in the existence of enough
> sophisticated users to matter. I gather it can be quite hard to obtain
> a Swedish interface for editing Thai.

I'm not talking about localized features, like for "å" to match "aa"
in Danish locales. I'm talking about matching strings that are
equivalent under canonical and compatibility decompositions.
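
A minimal sketch of that kind of matching, in Python, assuming that
normalizing both strings to NFKD with the standard unicodedata module is
an acceptable stand-in for "equivalent under canonical and compatibility
decompositions". The helper names fold and contains are hypothetical,
chosen only for illustration; this is not how any particular editor
implements search.

```python
import unicodedata


def fold(text: str) -> str:
    """Return the compatibility decomposition (NFKD) of text."""
    return unicodedata.normalize("NFKD", text)


def contains(haystack: str, needle: str) -> bool:
    """True if needle occurs in haystack once both are NFKD-normalized."""
    return fold(needle) in fold(haystack)


# U+00B2 SUPERSCRIPT TWO has a compatibility decomposition to "2".
assert contains("x\u00b2 + 1", "2")

# U+00E9 (precomposed e-acute) is canonically equivalent to "e" + U+0301.
assert contains("caf\u00e9", "cafe\u0301")

# U+00E5 decomposes to "a" + U+030A, not to "aa"; matching it against
# "aa" is a locale tailoring, outside what decomposition alone gives.
assert not contains("\u00e5", "aa")
```

Note that a plain substring test on decomposed text is looser than
equivalence: "cafe" would also be found in "café", because the base
letter precedes its combining mark. Whether that is wanted is exactly
the sort of thing that belongs under user control.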

As for user sophistication, AFAIR, Microsoft Word finds "" when you
search for "2" by default, so it sounds like Word considers all users
sophisticated enough for that. I think that's a solid enough
precedent to follow.
Unicode mailing list
Received on Fri Feb 20 2015 - 02:05:15 CST
