Re: IDN problem.... :(

From: Doug Ewell
Date: Fri Feb 11 2005 - 10:28:22 CST

    Marcin 'Qrczak' Kowalczyk <qrczak at knm dot org dot pl> wrote:

    > Don't look for wrong patterns. Ensure that there is a good pattern
    > instead. In particular characters not belonging to any regular writing
    > system, like arrows or half-wide Latin letters, are rejected.

    IDN strings go through a process called "nameprep" before being encoded
    in Punycode. Nameprep is a combination of NFKC, case folding, removal of
    control characters and space characters, etc. This means it should not
    ever be possible to create a domain name like pay​ (with ZWSP) or (with fullwidth Latin).
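
A rough sketch of the kind of processing described above, using Python's standard `stringprep` and `unicodedata` modules. The function name `sketch_nameprep` is hypothetical, and this is a simplified illustration under stated assumptions rather than a complete RFC 3491 implementation: it skips the bidirectional checks and most of the prohibited-character tables.

```python
import stringprep
import unicodedata


def sketch_nameprep(label: str) -> str:
    """Simplified nameprep-like pass; not a full RFC 3491 implementation."""
    mapped = []
    for ch in label:
        # Table B.1: characters "mapped to nothing" (includes U+200B ZWSP).
        if stringprep.in_table_b1(ch):
            continue
        # Table B.2: case folding intended for use with NFKC.
        mapped.append(stringprep.map_table_b2(ch))
    # Compatibility normalization turns fullwidth Latin into plain ASCII.
    result = unicodedata.normalize("NFKC", "".join(mapped))
    for ch in result:
        # Tables C.1.2 and C.2.2: non-ASCII spaces and control characters.
        if stringprep.in_table_c12(ch) or stringprep.in_table_c22(ch):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return result


print(sketch_nameprep("pay\u200b"))           # "pay" followed by ZWSP
print(sketch_nameprep("\uff30\uff21\uff39"))  # fullwidth "PAY"
```

Both calls should print plain ASCII `pay`: the ZWSP is dropped and the fullwidth letters are folded and normalized, so neither variant survives as a distinct label.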

    What nameprep explicitly does *not* do is attempt to create a mapping
    between Latin p, Greek ρ, Cyrillic р, Cherokee Ꮲ, Deseret 𐑁, and so
    forth. This is just too slippery and font-dependent. I've noticed
    through the last few years that despite all the calls for
    visual-similarity mapping tables, nobody actually volunteers to undertake
    this project. Probably they stop as soon as they encounter the
    "semi-confusables" like υ and к and realize it's not as simple an issue
    as they thought.
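
To illustrate the point, here is a small sketch showing that case folding plus NFKC leaves cross-script lookalikes untouched. The helper name `fold` is hypothetical, and it only approximates the mapping step; it is not nameprep itself.

```python
import unicodedata

# Latin p, Greek rho, Cyrillic er: visually similar in many fonts.
lookalikes = ["p", "\u03c1", "\u0440"]


def fold(ch: str) -> str:
    # Case folding followed by compatibility normalization.
    return unicodedata.normalize("NFKC", ch.casefold())


for ch in lookalikes:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> U+{ord(fold(ch)):04X}")

# All three remain distinct code points after folding.
distinct = {fold(ch) for ch in lookalikes}
print(len(distinct))
```

Each character maps to itself, so three labels differing only in these letters would stay three different names.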

    I don't know about arrows, but it seems unlikely that these would be
    useful for spoofing.

    -Doug Ewell
     Fullerton, California

