Re: First posting to list: Unicode.org: unicode - punycode converter tool?

From: Mark Davis ☕ (mark@macchiato.com)
Date: Sat Oct 30 2010 - 13:52:17 CDT


    Whether or not it was a good idea to have ß in domain names (post-mapping)
    is moot at this point, given IDNA2008. The key will be to manage the
    transition well. For many years, client software (browsers, etc.) will be
    converting ß to ss in domain names. To prevent serious problems, it's
    recommended that any registrar that allows ß do the following:

    If someone attempts to register a label with any ß, check if the
    corresponding label with all ss's is registered.

       1. If so, reject the registration unless the registrant is precisely the
          same.
       2. If not, automatically give the registrant both labels.

    That way both new and old browsers will continue to work, and security and
    operability problems will be avoided: (1) avoids security problems, while
    (2) gives correct results for both new and old client software.
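
    The check above can be sketched roughly as follows. This is only an
    illustration, not any registrar's actual implementation: the function
    name labels_to_grant, the in-memory dict standing in for the registry
    database, and the use of PermissionError are all assumptions made for
    the example. Labels are assumed to be already lowercased.

```python
from typing import Dict, List

ESZETT = "\u00df"  # ß


def labels_to_grant(label: str, registrant: str,
                    registry: Dict[str, str]) -> List[str]:
    """Return the labels to register for this request.

    `registry` is a hypothetical stand-in mapping each registered
    label to its owner. Raises PermissionError when the request must
    be rejected under rule (1).
    """
    if ESZETT not in label:
        # No ß involved: the policy does not apply.
        return [label]

    # The corresponding label with every ß replaced by "ss".
    ss_label = label.replace(ESZETT, "ss")
    owner = registry.get(ss_label)

    if owner is not None:
        # Rule (1): the ss label exists, so only its owner may add the ß label.
        if owner != registrant:
            raise PermissionError(
                f"{ss_label!r} is already registered to someone else")
        return [label]

    # Rule (2): the ss label is free, so the registrant gets both labels.
    return [label, ss_label]
```

    With a registry holding only "strasse" for one registrant, a request
    for "straße" from anyone else is rejected, while a request for "maße"
    yields both "maße" and "masse".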

    If client software knows that a registrar follows this policy, it can
    then allow ß to be unmapped for that registrar.

    The same goes for each of the 4 transition characters: ß, ς, and the two
    joiners.
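
    On the client side, the idea might look something like the sketch
    below. The mapping table reflects my understanding of UTS #46
    transitional processing (ß to ss, final sigma ς to σ, and the two
    joiners ZWNJ and ZWJ removed). The function name prepare_label and the
    registrar_follows_policy flag are made up for the example; how a client
    would actually learn a registrar's policy is left open here.

```python
# The four transition characters and what older (IDNA2003-style)
# clients turn them into.
TRANSITION_MAP = {
    "\u00df": "ss",      # ß -> ss
    "\u03c2": "\u03c3",  # ς (final sigma) -> σ
    "\u200c": "",        # ZERO WIDTH NON-JOINER -> removed
    "\u200d": "",        # ZERO WIDTH JOINER -> removed
}


def prepare_label(label: str, registrar_follows_policy: bool) -> str:
    """Return the label a client would look up.

    If the registrar is known to follow the bundling policy, the
    transition characters are left as they are; otherwise they are
    mapped the way existing clients map them.
    """
    if registrar_follows_policy:
        return label
    return "".join(TRANSITION_MAP.get(ch, ch) for ch in label)
```

    Only the handling of the four characters is shown; the remaining
    mapping, validation, and Punycode steps are omitted.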

    Mark

    *— Il meglio è l’inimico del bene —*

    On Sat, Oct 30, 2010 at 04:57, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:

    > On 2010/10/30 9:17, Markus Scherer wrote:
    >
    >> On Fri, Oct 29, 2010 at 3:57 PM, JP Blankert (thuis& PC based)<
    >> jpblankert@zonnet.nl> wrote:
    >>
    >> Dear unicode.org interested,
    >>>
    >>> I discovered at least 1 flaw in the converter tools I used so far (as
    >>> Verisign's IDN to punycode converter): none of the ones I checked
    >>> recognises
    >>> the German character
    >>>
    >>> ß
    >>>
    >>> (the sz, as from 'Straße' )
    >>>
    >>> correctly, the sign is always dissolved in ss.
    >>>
    >>>
    >> This is standard IDNA2003 behavior.
    >>
    >
    > Yes.
    >
    > It is usually desirable
    >>
    >
    > It is desirable in searching, but it wasn't desirable in domain names. The
    > reason it got into IDNA2003 is because the IETF was looking for data to do
    > case mapping beyond ASCII, and the data available from the Unicode
    > consortium included the 'ß' -> ss mapping, and the IETF didn't want to
    > change it because they feared that might start all kinds of discussions on
    > all kinds of (essentially unrelated) issues.
    >
    >
    > because a) many
    >> German speakers are unsure about when exactly to use ß vs. ss,
    >>
    >
    > Yes, but for many names, it's either one or the other. Essentially, no
    > rules.
    >
    >
    > b) the
    >> spelling reform a few years ago changed the rules,
    >>
    >
    > Yes. They got way easier and more straightforward.
    >
    >
    > and c) Switzerland does
    >> not use ß at all in German.
    >>
    >
    > Yes. But that's no reason to take it away from those who use it.
    > (at least myself being Swiss I don't think so)
    >
    >
    > This means that for most purposes it is
    >> counter-productive (and can be a security risk) to distinguish ß and ss.
    >>
    >
    > Well, it can be a security risk to distinguish between 'i' and 'l' and '1',
    > and so on, and nevertheless, it's being done for good reasons all the time.
    >
    >
    > IDNA2008, an incompatible update, by itself does not map characters.
    >>
    >
    > What's more important, IDNA2008 allows the 'ß' as is.
    >
    >
    > UTS #46
    >> provides a compatibility bridge for both IDNA2003 and IDNA2008, and the ß
    >> behavior is an option there.
    >>
    >
    > Yes. The basic idea in TR #46 is that in a first phase, 'ß' is mapped to
    > 'ss' for lookup, to give registries with German clients a chance to let
    > their clients register true 'ß' where necessary. After that, the mapping
    > can be dropped, so that in the (somewhat distant) future a name with 'ß'
    > and a name with 'ss' can be resolved differently.
    >
    > Regards, Martin.
    >
    >
    > --
    > #-# Martin J. Dürst, Professor, Aoyama Gakuin University
    > #-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
    >
    >


