Re: [idn] IDN spoofing

From: Hans Aberg (haberg@math.su.se)
Date: Sat Feb 19 2005 - 13:05:21 CST


    At 21:54 -0800 2005/02/18, Doug Ewell wrote:
    >Keld Jørn Simonsen <keld at dkuug dot dk> wrote:
    >
    >> Of course we should minimize the risks for internet users of being
    >> misled. This could be done by equalizing similar characters,
    >> like Latin, Cyrillic and Greek A, 0 and O, 1 and 1 etc, so that no
    >> visual misleading should be possible.
    >
    >OK, now fill in the "et cetera," now that you've got the obvious ones
    >out of the way.
    >
    >As long as Erik has already mentioned it (thank you very much), see my
    >post from 3 years ago to see how this task quickly goes from simple to
    >tricky to impossible:
    >
    >http://ops.ietf.org/lists/idn/idn.2002/msg00498.html

    If one does it that way, one quickly gets into trouble. Instead, one can
    define a map that merges some characters for the purpose of telling IDNs
    apart, while retaining the original Unicode character set at the user
    level for input. Take a character set C, which might be a subset of
    Unicode, and send each Unicode character (or each character of a suitable
    subset) to a finite sequence over C. Two IDNs are declared equal if they
    map to the same character sequence. The map is used only to define which
    IDNs are viewed as equal; one is still free to use whatever Unicode
    sequences one wants. In other words, the map defines an equivalence
    relation on the set of Unicode character sequences, but does not in itself
    affect which Unicode sequences are admissible.
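
    As a rough illustration, here is a minimal Python sketch of such a map.
    The table MERGE and the names skeleton and same_idn are made up for this
    example; the table is deliberately tiny and is not a real confusables
    list. Characters without an entry are assumed to map to themselves.

```python
# Hypothetical merge table: each Unicode character is sent to a finite
# sequence over a reduced character set C. Illustrative entries only.
MERGE = {
    "\u0410": "A",   # CYRILLIC CAPITAL LETTER A -> Latin A
    "\u0391": "A",   # GREEK CAPITAL LETTER ALPHA -> Latin A
    "\u041e": "O",   # CYRILLIC CAPITAL LETTER O -> Latin O
    "\u039f": "O",   # GREEK CAPITAL LETTER OMICRON -> Latin O
    "0": "O",        # DIGIT ZERO -> Latin O
    "\u0133": "ij",  # LATIN SMALL LIGATURE IJ -> two-character sequence
}


def skeleton(label: str) -> str:
    """Apply the map character by character and concatenate the results."""
    return "".join(MERGE.get(ch, ch) for ch in label)


def same_idn(a: str, b: str) -> bool:
    """Two labels are declared equal iff they map to the same sequence."""
    return skeleton(a) == skeleton(b)
```

    Note that skeleton is only used for comparison. Nothing here rejects or
    rewrites the label the user typed, so the set of admissible Unicode
    sequences is left alone.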

    This is in fact a common mathematical method: if two objects need to be
    separated, define a map that separates them. If many objects need to be
    separated, one may use several maps, combined into a single function whose
    codomain is the product of their codomains, until the properties one wants
    to capture are separated. So if a map defined as above fails, in other
    circumstances, to separate the characters the way one wants, define
    another map.
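
    The product construction can be sketched in Python as follows. The
    helper product_map and the two component maps are assumptions chosen for
    illustration: fold ignores case, and merge uses a one-entry look-alike
    table. The combined map separates two strings exactly when at least one
    component does.

```python
from typing import Callable, Tuple


def product_map(*maps: Callable[[str], object]) -> Callable[[str], Tuple]:
    """Combine several maps into one whose value is the tuple of results,
    i.e. a function into the product of the individual codomains."""
    def combined(x: str) -> Tuple:
        return tuple(m(x) for m in maps)
    return combined


def fold(s: str) -> str:
    """First map: ignore case."""
    return s.casefold()


# Hypothetical one-entry look-alike table (Cyrillic small a -> Latin a).
LOOKALIKE = {"\u0430": "a"}


def merge(s: str) -> str:
    """Second map: merge the look-alike characters listed in LOOKALIKE."""
    return "".join(LOOKALIKE.get(ch, ch) for ch in s)


both = product_map(fold, merge)
```

    Here fold alone cannot tell "a" from "A", and merge alone cannot tell
    Latin "a" from Cyrillic "\u0430", but both(...) distinguishes each pair
    because one of its components does.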

      Hans Aberg


