From: Mark Crispin (mrc+unicode@panda.com)
Date: Fri Nov 13 2009 - 13:20:45 CST
If the text editor uses i;unicode-casemap (RFC 5051) for its search, it
will find both. This is because U+2212 decomposes to U+002D.
It is certainly possible to find examples in which i;unicode-casemap won't
bail you out; it was intended to be a very basic first-level comparator
that is simple to implement. But at least in this example, you have what
you want.
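
A rough sketch of that kind of search, in Python. It only approximates
i;unicode-casemap: RFC 5051 titlecases each character with the simple
titlecase mapping and then decomposes it fully, whereas this sketch leans
on str.title() (full mapping) and NFKD from the standard unicodedata
module. The names casemap_key, find_all and describe are made up for
illustration. The sample uses U+FF0D FULLWIDTH HYPHEN-MINUS, which has a
compatibility decomposition to U+002D. Whether any other character folds
the same way depends on its decomposition field in UnicodeData.txt, so it
is worth running describe() on each character you care about, U+2212
included.

```python
import unicodedata


def casemap_key(text: str) -> str:
    """Approximate an i;unicode-casemap comparison key (assumption: not exact)."""
    parts = []
    for ch in text:
        titled = ch.title()  # full titlecase mapping; RFC 5051 uses the simple one
        # Decompose per character, so no reordering happens across characters.
        parts.append(unicodedata.normalize("NFKD", titled))
    return "".join(parts)


def find_all(lines, needle):
    """Return the lines whose key contains the key of needle."""
    key = casemap_key(needle)
    return [line for line in lines if key in casemap_key(line)]


def describe(ch: str) -> str:
    """Show the decomposition mapping recorded for one character."""
    mapping = unicodedata.decomposition(ch) or "(no decomposition)"
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {mapping}"


if __name__ == "__main__":
    text = ["3-2*4", "3\uFF0D2*5", "7+1"]
    for hit in find_all(text, "3-2"):
        print(hit)                      # the first two lines match
    for ch in "\uFF0D\u2212":
        print(describe(ch))             # check what each character decomposes to
```

Comparing keys rather than raw strings keeps the search itself a plain
substring test; all of the Unicode knowledge sits in casemap_key.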
Best wishes,
-- Mark --
On Fri, 13 Nov 2009, sergey wrote:
> Please imagine that we have a big text file. At the beginning of this
> file someone wrote:
> 3-2*4
> The "-" here is U+002D.
> In the middle of the file someone else wrote:
> 3−2*5
> The "−" here is U+2212.
> Now imagine that you see "3-2*4" and want to find everything that means
> "3 minus 2" in the file. You ask your text editor to search for "3-2".
> It will find only "3-2*4", not both, because "-" and "−" have different
> code points in Unicode.
-- Mark --
http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.