Re: Standardizing search for similar symbols

From: Neil Harris (
Date: Sat Nov 14 2009 - 18:59:45 CST

    sergey wrote:
    > Hello, Mark!
    > It is news to me that algorithms for this kind of search exist :-)
    > But as you mention, this is a very basic method that can't solve all problems.
    > I am sure that it can't work with most of the sets that I listed in my first post.
    > Searching for strings with Cyrillic "С" and Latin "C" matters, for example, because
    > these letters:
    > 1) look exactly the same;
    > 2) share the same key in the Russian and English keyboard layouts.
    > I do not know of any text editor that has an option to use RFC 5051 or something like it.
    > If information about similar symbols is added to the Unicode Character Database, then
    > text editor developers may take notice :-)
    > Regards, Sergey

    You might want to take a look at Unicode Technical Report 39, which links to a huge
    machine-readable list of "visually confusable" characters as part of its
    supporting material.
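
As a rough illustration of how such a list could drive a search, here is a minimal Python sketch. The table `SAMPLE_CONFUSABLES` and the function names `skeleton` and `confusable_contains` are made up for this example; the handful of Cyrillic/Latin pairs is hand-picked rather than taken from the report's data, and a real tool would presumably load the full machine-readable list instead. It only shows the general idea of folding look-alike characters to one prototype before comparing.

```python
import unicodedata

# Hypothetical, hand-picked sample of look-alike pairs (Cyrillic -> Latin).
# This is NOT the real data set; it only stands in for the much larger
# machine-readable list that accompanies the report.
SAMPLE_CONFUSABLES = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u041E": "O",  # CYRILLIC CAPITAL LETTER O
    "\u043E": "o",  # CYRILLIC SMALL LETTER O
    "\u0420": "P",  # CYRILLIC CAPITAL LETTER ER
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}


def skeleton(text):
    """Normalize to NFD, then replace each character by its sample prototype."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(SAMPLE_CONFUSABLES.get(ch, ch) for ch in decomposed)


def confusable_contains(haystack, needle):
    """Return True if needle occurs in haystack once both are folded."""
    return skeleton(needle) in skeleton(haystack)


if __name__ == "__main__":
    # The haystack is written with Cyrillic ES, the needle with Latin C.
    print(confusable_contains("Big \u0421at", "Cat"))
```

Folding both strings the same way keeps the comparison symmetric, so it does not matter which of the two was typed in the wrong keyboard layout.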

    -- Neil

