Re: Standartising search for similar symbols

From: sergey (sergey-feo@yandex.ru)
Date: Sat Nov 14 2009 - 16:58:39 CST

    Hello, Mark!

    It is news to me that algorithms for this kind of search exist :-)
    But as you mention, this is a very basic method that can't solve all problems.
    I am sure that it can't handle most of the sets that I listed in my first post.

    Searching for strings with Cyrillic "С" and Latin "C" matters, for example, because
    these letters:
    1) look exactly the same;
    2) share the same key in the Russian and English keyboard layouts.
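
    To make this concrete, here is a minimal sketch using Python's standard
    unicodedata module. The search_key function is a simplified stand-in
    (compatibility decomposition plus case folding) chosen for illustration;
    it is an assumption, not the RFC 5051 algorithm itself. It shows that the
    two letters remain different after this kind of basic folding:

```python
import unicodedata


def search_key(text: str) -> str:
    """Simplified search key: compatibility decomposition plus case folding.

    This only approximates a basic folding collation for illustration;
    it is not an implementation of i;unicode-casemap.
    """
    return unicodedata.normalize("NFKD", text).casefold()


cyrillic_es = "\u0421"  # CYRILLIC CAPITAL LETTER ES
latin_c = "\u0043"      # LATIN CAPITAL LETTER C

# The glyphs look the same, but the characters have different names.
print(unicodedata.name(cyrillic_es))
print(unicodedata.name(latin_c))

# Neither letter has a decomposition, so folding cannot unify them.
print(search_key(cyrillic_es) == search_key(latin_c))  # False
```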

    I do not know of any text editor that offers an option to use RFC 5051 or anything like it.
    If information about similar symbols were added to the Unicode Character Database,
    text editor developers might take notice :-)
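
    As a rough sketch of what an editor could do with such information: the
    SIMILAR table below is hypothetical and hand-written for this example (it
    is not data from the Unicode Character Database), and fold_similar and
    find_similar are illustrative names. Each similar-looking character is
    mapped to one representative before the text is compared:

```python
# Hypothetical similarity table, written by hand for illustration only.
# Each key is replaced by a single representative character.
SIMILAR = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u041E": "O",  # CYRILLIC CAPITAL LETTER O
    "\u043E": "o",  # CYRILLIC SMALL LETTER O
}
_TABLE = str.maketrans(SIMILAR)


def fold_similar(text: str) -> str:
    """Replace each listed character with its representative."""
    return text.translate(_TABLE)


def find_similar(haystack: str, needle: str) -> int:
    """Return the index of the first match, or -1 if there is none.

    The mapping is one character to one character, so an index in the
    folded text is also a valid index in the original text.
    """
    return fold_similar(haystack).find(fold_similar(needle))


text = "\u0421OLA"  # Cyrillic Es followed by Latin "OLA"
print(text.find("COLA"))           # -1: the plain search misses it
print(find_similar(text, "COLA"))  # 0: the folded search finds it
```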

    Regards, Sergey

    --------------

    On Fri, 13 Nov 2009 11:20:45 -0800 (PST)
    Mark Crispin <mrc+unicode@panda.com> wrote:

    > If the text editor uses i;unicode-casemap (RFC 5051) for its search, it
    > will find both. This is because U+2212 decomposes to U+002D.
    >
    > It is certainly possible to find examples in which i;unicode-casemap won't
    > bail you out; it was intended to be a very basic first-level that is
    > simple to implement. But at least in this example, you have what you
    > want.
    >
    > Best wishes,
    >
    > -- Mark --
    ---------------------------
    Сергей Феоктистов


