Re: Unicode and RFC 4690

From: Philippe Verdy (
Date: Thu Oct 05 2006 - 13:50:13 CST


    From: "Neil Harris" <>
    > UTR #36 and UTR #39 have a very detailed treatment of all the issues
    > involved.
    > Notice that implementing these constraints on a per-label basis has no
    > bearing at all on script-mixing between different labels in a FQDN,
    > which is not a security problem, and that nothing in the above policy
    > need stop labels from any of a number of different individual character
    > sets from being issued in the same zone, providing care is taken to
    > block or bundle possible collisions.
    > Politics shouldn't be the issue here: individual domain operators and
    > their users should all have a common interest in preventing homograph
    > attacks, and these techniques can work effectively regardless of
    > political issues.

    One problem with this RFC is that the current format for a registry's database of confusables (the characters it treats as equivalents) is NOT integrated into the DNS, which is what it would need in order to scale widely.

    I would rather expect a format that can be integrated completely as DNS records, possibly with a new DNS record type, that is simple to parse, and that each DNS server can cache reliably by reference to an authoritative DNS server maintained by the registry (or by the domain administrator, if this is in a private domain).
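    To make "simple to parse" concrete, here is a minimal sketch of how a resolver-side tool might consume such records. The record layout (two hexadecimal code points per record, variant first) is purely hypothetical, as are the names SAMPLE_RECORDS, parse_records, skeleton and collide; no such DNS record type exists. The four mappings are only an illustrative sample of Cyrillic/Latin lookalikes, not any registry's actual table.

```python
from typing import Dict, Iterable

# Hypothetical textual payload of a zone's confusable records:
# "<variant code point> <preferred code point>", both in hex.
# Illustrative sample only, not a real registry table.
SAMPLE_RECORDS = [
    "0430 0061",  # CYRILLIC SMALL LETTER A  -> LATIN SMALL LETTER A
    "0435 0065",  # CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
    "043E 006F",  # CYRILLIC SMALL LETTER O  -> LATIN SMALL LETTER O
    "0440 0070",  # CYRILLIC SMALL LETTER ER -> LATIN SMALL LETTER P
]


def parse_records(records: Iterable[str]) -> Dict[str, str]:
    """Build a variant -> preferred character table from the record payloads."""
    table: Dict[str, str] = {}
    for record in records:
        variant, preferred = record.split()
        table[chr(int(variant, 16))] = chr(int(preferred, 16))
    return table


def skeleton(label: str, table: Dict[str, str]) -> str:
    """Replace every variant character by its preferred form."""
    return "".join(table.get(ch, ch) for ch in label)


def collide(a: str, b: str, table: Dict[str, str]) -> bool:
    """Two labels collide (and should be blocked or bundled) if their skeletons match."""
    return skeleton(a, table) == skeleton(b, table)


TABLE = parse_records(SAMPLE_RECORDS)
```

    With a table like this cached from the authoritative server, a registry or a private subdomain could check a requested label against the labels already issued in the zone before accepting it.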

    Such files do not address the needs of local subdomains, and having a single file per language will not resolve the issues of security and ease of deployment.

    Note that even if a TLD registry does not support IDN, support for IDN labels may be present (wanted, needed) within a subdomain for various things such as user names, product names, book titles... used as labels within a private subdomain.

    If every registry (or domain name authority) could specify its own rules regarding acceptable characters and their IDN-canonical equivalents, things would be simpler. The RFC just needs to address the features required of the IDN implementation, i.e. the implicit (non-negotiable) support for Unicode canonical equivalents (for which it is NOT necessary to specify the list of all possible equivalents).
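    The reason no list is needed is that canonical equivalence is computable: normalizing two strings to the same Unicode normalization form and comparing the results is enough. A minimal sketch using Python's standard unicodedata module follows; the helper names canonical_label and canonically_equivalent are mine, and this illustrates canonical equivalence only, not the full IDN label processing.

```python
import unicodedata


def canonical_label(label: str) -> str:
    """Return the NFC form, a single representative for all canonically equivalent spellings."""
    return unicodedata.normalize("NFC", label)


def canonically_equivalent(a: str, b: str) -> bool:
    """True if the two labels differ at most by canonically equivalent sequences."""
    return canonical_label(a) == canonical_label(b)


precomposed = "caf\u00e9"   # 'e with acute' as the single code point U+00E9
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT
```

    The two spellings are different code point sequences, yet canonically_equivalent(precomposed, decomposed) holds, so a registry never has to enumerate such pairs in its own rules.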

    Each TLD registry or subdomain authority should provide a default set of rules that applies in all subdomains, unless one of the registered domains contains a record referencing another rule set (which should be another domain name that specifies the complete set of rules), or records specifying overrides (for example, support for more characters). One of the common rule sets should be the default reduced ASCII-only subset (i.e. no support for IDN); this should be specified simply by referencing the domain name of the root registry (if the root must remain ASCII-only), or a documented domain name owned by the authority managing the root (for example, or some special subdomain (for example:, where the data of the rule set is registered.
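    The lookup this implies can be sketched as follows. Everything here is an assumption made for illustration: the RULES dictionary stands in for records that would live in the DNS, the keys "allow", "extra" and "ref" are invented field names, and the domain names under "example." are placeholders. A domain with no record inherits from its parent, a "ref" replaces the inherited rules with the complete rule set of another domain, and "extra" adds characters on top of whatever is inherited.

```python
from typing import Dict, Optional, Set

ASCII_LDH: Set[str] = set("abcdefghijklmnopqrstuvwxyz0123456789-")

# Hypothetical stand-in for rule-set records published in the DNS.
RULES: Dict[str, dict] = {
    ".": {"allow": ASCII_LDH},                         # root stays ASCII-only
    "example.": {"extra": set("éè")},                  # override: a few more characters
    "idn-rules.example.": {"allow": ASCII_LDH | set("äöüß")},
    "shop.example.": {"ref": "idn-rules.example."},    # reference to a complete rule set
}


def parent(domain: str) -> Optional[str]:
    """Return the enclosing domain, or None for the root."""
    if domain == ".":
        return None
    rest = domain.split(".", 1)[1]
    return rest if rest else "."


def effective_rules(domain: str, _seen: Optional[Set[str]] = None) -> Set[str]:
    """Resolve the set of characters allowed in labels registered under `domain`."""
    seen = _seen if _seen is not None else set()
    if domain in seen:
        raise ValueError("rule-set reference loop at " + domain)
    seen.add(domain)
    record = RULES.get(domain, {})
    if "ref" in record:                 # another domain holds the complete rule set
        return effective_rules(record["ref"], seen)
    if "allow" in record:               # this domain defines a complete rule set itself
        base = set(record["allow"])
    else:                               # otherwise inherit the parent's default
        up = parent(domain)
        if up is None:
            raise LookupError("no rule set defined at the root")
        base = effective_rules(up, seen)
    return base | record.get("extra", set())


def label_allowed(label: str, domain: str) -> bool:
    """Check one label against the rules in force for the domain it is registered in."""
    allowed = effective_rules(domain)
    return all(ch in allowed for ch in label.lower())
```

    In this sketch "café" is acceptable under "example." because of the override, but not under "shop.example.", since a referenced rule set is taken as complete and replaces what would otherwise be inherited.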
