Re: Unicode and RFC 4690

From: Neil Harris (neil@tonal.clara.co.uk)
Date: Thu Oct 05 2006 - 11:53:37 CST

Next message: Magda Danish (Unicode): "The Unicode Standard, Version 5.0 preorder period ends in 10 days!"

Previous message: Jim Melton: "Re: ISO/IEC 10646 and ISO/IEC 14651 freely available"
In reply to: Stephane Bortzmeyer: "Re: Unicode and RFC 4690"
Next in thread: Philippe Verdy: "Re: Unicode and RFC 4690"
Reply: Philippe Verdy: "Re: Unicode and RFC 4690"

Stephane Bortzmeyer wrote:
> On Thu, Oct 05, 2006 at 09:32:08AM +0200,
> Philippe Verdy <verdy_p@wanadoo.fr> wrote
> a message of 10 lines which said:
>
>
>> * avoiding labels using multiple scripts and informing users when an
>> IDN label may contain confusable characters. This is for the
>> immediate client-side need.
>>
>
> And it is a terrible idea because, in many countries, Latin letters are
> used together with the local script, at least in the computing domain
> (Russia is a good example).
>
>
>> * developing a standard within the DNS that allows each DNS server
>> to specify which set of non confusable characters it accepts for
>> registration as subdomain names.
>>
>
> Warning: registration is not done at the DNS server but at the
> registry system. There are much less registries than DNS servers so
> the need of a standard is less obvious.
>
> Otherwise, there *is* a standard to express the list of authorized
> characters (RFC 4290 and I attach a table at this syntax for the french
> language).
>
> It does not address confusability issues because the entire area
> is... confuse and has no solution. It is just a way for ICANN to step
> in the registration policies of TLDs.
>

As RFC 4690 acknowledges, there can never be a perfect solution to the
confusables problem; but that does not mean we shouldn't try to address
it. As it happens, we can do rather well at reducing the possibilities
for spoofing to very low levels.

Character repertoires for DNS labels, combined with script-mixing rules,
can _completely eliminate_ the possibility of mixed-script confusables,
as well as vastly reducing the opportunities for within-script
confusables. Doing this reduces the combinatorial opportunities for
generating spoofs by many orders of magnitude. It also simplifies the
task of constructing confusables lists, which greatly increases the
chances of successfully blocking the remaining single-script and
whole-script confusables by other means, such as homograph lists.
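
To make the per-label idea concrete, here is a rough Python sketch of a
repertoire check plus a script-mixing check. It is only an illustration
under stated assumptions: the script of a character is guessed from the
first word of its Unicode name (a real implementation would use the
Script property data that UTR #39 relies on), the names script_of,
is_single_script and in_repertoire are made up for this example, and
HYPOTHETICAL_REPERTOIRE is an invented set, not the RFC 4290 table
mentioned earlier in the thread.

```python
import unicodedata

# Assumption: digits and the hyphen are treated as script-neutral.
NEUTRAL = set("0123456789-")

# Hypothetical repertoire for illustration only (not a real registry table).
HYPOTHETICAL_REPERTOIRE = set("abcdefghijklmnopqrstuvwxyz") | NEUTRAL | set("àâçéèêëîïôùûüÿ")


def script_of(ch):
    """Approximate a character's script by the first word of its Unicode name."""
    if ch in NEUTRAL:
        return "COMMON"
    # Unnamed characters fall back to "UNKNOWN" instead of raising ValueError.
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def label_scripts(label):
    """Return the set of (approximate) scripts used in one label."""
    return {script_of(ch) for ch in label} - {"COMMON"}


def is_single_script(label):
    """True if the label draws its letters from at most one script."""
    return len(label_scripts(label)) <= 1


def in_repertoire(label, repertoire):
    """True if every character of the label is in the permitted set."""
    return all(ch in repertoire for ch in label)


if __name__ == "__main__":
    for label in ["paypal", "p\u0430ypal", "caf\u00e9"]:
        print(ascii(label), is_single_script(label),
              in_repertoire(label, HYPOTHETICAL_REPERTOIRE))
```

The second test label replaces the Latin "a" with CYRILLIC SMALL LETTER A
(U+0430), so it fails both checks. The labels here are in their Unicode
form; a registry would apply checks like these before conversion to the
ASCII-compatible encoding.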

UTR #36 and UTR #39 give a very detailed treatment of all the issues
involved.

Notice that implementing these constraints on a per-label basis has no
bearing at all on script-mixing between different labels in an FQDN,
which is not a security problem. Nothing in the above policy need stop
labels from any of a number of different individual character sets from
being issued in the same zone, provided care is taken to block or
bundle possible collisions.
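
As a sketch of what that could look like in practice, the Python below
checks each label of a name separately and then looks for collisions
between a candidate label and labels already in the zone. Everything
named here is an assumption made for illustration: the script guess from
Unicode character names, the function names, and above all the
CONFUSABLES map, which is a deliberately tiny hand-picked sample rather
than real confusables data.

```python
import unicodedata

# Hypothetical, deliberately tiny map of Cyrillic letters to Latin look-alikes.
# A real deployment would draw on much larger confusables data.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}

NEUTRAL = set("0123456789-")


def script_of(ch):
    """Approximate the script by the first word of the Unicode name."""
    if ch in NEUTRAL:
        return "COMMON"
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def label_ok(label):
    """Single-script rule, applied to one label only."""
    scripts = {script_of(ch) for ch in label} - {"COMMON"}
    return len(scripts) <= 1


def fqdn_ok(fqdn):
    """Each label is checked on its own; labels may differ in script."""
    return all(label_ok(label) for label in fqdn.split(".") if label)


def skeleton(label):
    """Fold known look-alikes onto one representative character."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in label)


def collides(candidate, registered):
    """True if the candidate looks like any label already in the zone."""
    target = skeleton(candidate)
    return any(skeleton(existing) == target for existing in registered)


if __name__ == "__main__":
    zone = {"coco", "example"}
    cyrillic_coco = "\u0441\u043e\u0441\u043e"  # all-Cyrillic look-alike of "coco"
    print(label_ok(cyrillic_coco), collides(cyrillic_coco, zone))
```

The all-Cyrillic label passes the single-script rule yet still collides
with the Latin "coco", which is the whole-script case that has to be
handled by blocking the later registration or bundling both labels to
the same registrant.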

Politics shouldn't be the issue here: individual domain operators and
their users should all have a common interest in preventing homograph
attacks, and these techniques can work effectively regardless of
political issues.

-- Neil


This archive was generated by hypermail 2.1.5 : Thu Oct 05 2006 - 11:55:38 CST