RE: IDN and Missed Normalisations

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon May 07 2007 - 13:19:55 CDT


    Richard Wordingham wrote:
    > The present standard for International Domain Name Processing (nameprep -
    > RFC 3491 and stringprep - RFC 3454) currently operates with four steps:
    > mapping, normalisation (NFKC), prohibition and bidi checking. Mapping
    > replaces single characters by sequences, which may be empty. It is composed
    > of two elements - deletion of default ignorables, and full case-folding,
    > complicated because it is done before compatibility decomposition. (I may
    > have missed some minor wrinkles in mapping.)
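The four steps listed in the quoted text can be sketched with Python's standard `stringprep` module, which exposes the RFC 3454 tables. This is an illustrative sketch only, not a conforming nameprep implementation: the name `nameprep_sketch` is hypothetical, and the handling of unassigned code points is left out.

```python
import stringprep
from unicodedata import ucd_3_2_0  # the stringprep tables are tied to Unicode 3.2

# Prohibition tables from RFC 3454 that nameprep (RFC 3491) selects.
PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c22, stringprep.in_table_c3,
    stringprep.in_table_c4, stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8, stringprep.in_table_c9,
)


def nameprep_sketch(label: str) -> str:
    """Hypothetical helper: map, normalise, prohibit, then check bidi."""
    # Step 1, mapping: delete table B.1 characters, fold case with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )
    # Step 2, normalisation (NFKC).
    normalized = ucd_3_2_0.normalize("NFKC", mapped)
    # Step 3, prohibition: reject any character found in a prohibited table.
    for ch in normalized:
        if any(check(ch) for check in PROHIBITED):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    # Step 4, bidi checking: if any RandALCat character (table D.1) is
    # present, no LCat character (table D.2) may appear, and the label
    # must both start and end with a RandALCat character.
    if any(stringprep.in_table_d1(ch) for ch in normalized):
        if any(stringprep.in_table_d2(ch) for ch in normalized):
            raise ValueError("label mixes RandALCat and LCat characters")
        if not (stringprep.in_table_d1(normalized[0])
                and stringprep.in_table_d1(normalized[-1])):
            raise ValueError("label must start and end with RandALCat")
    return normalized
```

For example, `nameprep_sketch("Ex\u00adAmple")` drops the soft hyphen (a table B.1 character) and folds the capitals.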

    Shouldn't Unicode normalization be the first step, performed before the
    mappings and deletions?

    Nameprep should produce identical result strings for all canonically
    equivalent Unicode input strings.
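A small Python experiment illustrates why the order of the steps matters for canonical equivalence. It is a sketch under stated assumptions: `str.casefold()` stands in for the nameprep mapping table, current Unicode data is used rather than Unicode 3.2, and both function names are hypothetical.

```python
import unicodedata


def fold_then_normalize(s: str) -> str:
    # Case folding first, NFKC afterwards (the order described in the quote).
    return unicodedata.normalize("NFKC", s.casefold())


def normalize_then_fold(s: str) -> str:
    # Canonical decomposition first, then case folding, then NFKC.
    decomposed = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFKC", decomposed.casefold())


# Two canonically equivalent spellings: alpha with an acute accent and a
# combining ypogegrammeni (U+0345), with the two marks in either order.
a = "\u03b1\u0345\u0301"
b = "\u03b1\u0301\u0345"
assert unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Case folding turns U+0345 into the base letter iota, so the accent ends up
# attached to a different letter depending on where it sat in the input.
print(fold_then_normalize(a) == fold_then_normalize(b))
print(normalize_then_fold(a) == normalize_then_fold(b))
```

Folding before normalizing yields different results for the two spellings, while normalizing first yields the same result for both.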

    The complication that you may have forgotten is that you must compute the
    closure of these steps, so that applying them again to the result changes
    nothing. Unicode provides a few closures for the combination of standard
    normalization and standard case folding.

    For IDN purposes, where nameprep performs additional case mappings, you need
    to compute the extra closures. Given that NFKC is one of the steps of the
    transformation, the nameprep result is in normalized form and its canonical
    equivalence should be guaranteed; otherwise your nameprep implementation is
    bogus.
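The need for a closure can be seen with a single character. The sketch below assumes `str.casefold()` as a stand-in for the case-folding step and simply iterates fold-plus-NFKC to a fixed point. A real implementation would precompute the closure per character in its mapping table instead. The names `one_pass` and `closure` are hypothetical.

```python
import unicodedata


def one_pass(s: str) -> str:
    # One round of case folding followed by NFKC.
    return unicodedata.normalize("NFKC", s.casefold())


def closure(s: str, max_rounds: int = 10) -> str:
    # Repeat the round until the string stops changing (a fixed point).
    for _ in range(max_rounds):
        nxt = one_pass(s)
        if nxt == s:
            return s
        s = nxt
    raise RuntimeError("no fixed point reached")


# U+3392 SQUARE MHZ has no case folding of its own, but its compatibility
# decomposition contains capital letters, so one round is not enough.
square_mhz = "\u3392"
print(one_pass(square_mhz))
print(closure(square_mhz))
```

After one round the capitals from the decomposition are still present. A second round folds them, and the result is then stable under further rounds.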


