Re: Attack vectors through Unassigned Code Points in IDN

From: Kenneth Whistler
Date: Wed Mar 18 2009 - 13:13:40 CST


    Chris Weber asked:

    > If I’m reading RFC 3491 correctly, then IDNA allows for
    > unassigned code points to exist in strings and domain names.

    Not exactly. IDNA prohibits unassigned code points from
    existing in "stored strings", i.e. those domain names
    actually registered and stored in zone files. It allows
    unassigned code points to occur in "query strings".

    > This makes spoofing attacks possible when these code
    > points don’t have associated glyphs and basically show up
    > as white space. This seems to be the case with some ranges
    > like U+115A..U+115E under.

    Side note: U+115A..U+115E are assigned now. They are assigned
    in the published Amendment 5 to 10646, and will be documented
    as part of the Unicode 5.2 release eventually.

    Of course for the published specification for IDNA, assigned
    characters are limited to Unicode 3.2, and the NamePrep spec
    requires use of tables associated with Unicode 3.2. So for
    currently deployed implementations of IDNA, U+115A..U+115E
    will continue to count as unassigned, even though systems
    may otherwise be implementing Unicode 5.1 or (eventually)
    Unicode 5.2.
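
    The disconnect is easy to observe in any environment that carries
    both sets of tables. A minimal sketch in Python, assuming its
    standard unicodedata module, which keeps a frozen Unicode 3.2
    database (ucd_3_2_0) next to the current one. The helper name
    categories() is made up for illustration:

```python
import unicodedata

# The range mentioned in the question above.
JAMO_RANGE = range(0x115A, 0x115E + 1)


def categories(cp: int) -> tuple[str, str]:
    """Return (General_Category under Unicode 3.2, under the current tables)."""
    ch = chr(cp)
    # ucd_3_2_0 exposes the same lookups as the module, frozen at Unicode 3.2,
    # which is the version the published NamePrep tables are tied to.
    return unicodedata.ucd_3_2_0.category(ch), unicodedata.category(ch)


for cp in JAMO_RANGE:
    old, new = categories(cp)
    # "Cn" means "unassigned" in the corresponding version of the tables.
    print(f"U+{cp:04X}  3.2={old}  {unicodedata.unidata_version}={new}")
```

    Under the 3.2 tables these code points report Cn (unassigned). What
    the second column shows depends on which Unicode version the local
    Python build ships with.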

    This disconnect should be addressed by the new IDNA spec.

    > My question is – was this the intended behavior of IDNA to
    > allow unassigned code points in IDN? Or is this more related
    > to a font rendering issue?

    See above. You can send a query string containing the
    NamePrep-converted version of some unassigned code points
    to a resolver. But it should never resolve, because those
    unassigned code points cannot be in a stored string.

    > 7. Unassigned Code Points in Internationalized Domain Names
    > If the processing in [IDNA] specifies that a list of unassigned code
    > points be used, the system uses table A.1 from [STRINGPREP] as its
    > list of unassigned code points.

    That text (from NAMEPREP) is merely specifying that for IDNA,
    the definition of unassigned code points is provided by
    table A.1 from STRINGPREP.

    The relevant text you need to connect that to is found in
    RFC 3454 [STRINGPREP]:

    7. Unassigned Code Points in Stringprep Profiles

    ... Stored strings using the profile MUST NOT contain
    any unassigned code points. Queries for matching strings
    MAY contain unassigned code points. ...

    And then IDNA itself, RFC 3490:

    6.3 DNS servers

    Domain names stored in zones follow the rules for "stored
    strings" from [STRINGPREP].


    This archive was generated by hypermail 2.1.5 : Wed Mar 18 2009 - 13:15:50 CST