RE: Case mapping of dotless lowercase letters

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Dec 16 2003 - 19:32:02 EST

    Peter Kirk writes:
    > If it is the client software (browser etc) which resolves the casing,
    > then how it resolves it is essentially a local matter which doesn't need
    > to be standardised. But my recommendation would be that the mapping
    > followed the local language context, i.e. in general the system locale
    > except where overridden by language markup in the local context e.g.
    > when the URL is embedded in a document. That is, "I" would map to "i",
    > unless the locale or markup language is tr or az in which case it would
    > map to dotless i. (There are actually a few other language orthographies
    > which use Turkic casing.) The alternative of using the Turkic mapping
    > for .tr and .az domains is possible but seems less desirable to me.
    >
    > If the casing is resolved by the nameserver, there is no alternative to
    > using the Turkic mapping only for .tr and .az domains.

    Turkic case mappings are not usable in DNS, nor even in IDNA, simply
    because all legacy ASCII names must continue to resolve ASCII 'I'
    identically to ASCII 'i', and not to dotless 'ı' (U+0131, which has to be
    encoded with Punycode). This is needed for upward compatibility.
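
To make the compatibility problem concrete, here is a minimal Python sketch. The helper names `dns_ascii_fold` and `turkic_lower` are hypothetical and exist only for this illustration; the Turkic rule is deliberately simplified to the two letters under discussion, and Python's built-in `idna` codec (which follows IDNA 2003) is used only to show that the result is no longer a plain ASCII label.

```python
import string

# Hypothetical helpers for illustration; neither is part of any DNS library.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def dns_ascii_fold(label: str) -> str:
    """Fold only A-Z to a-z, the way legacy ASCII DNS names compare."""
    return label.translate(_ASCII_FOLD)

def turkic_lower(label: str) -> str:
    """Simplified Turkic lowercasing: 'I' -> dotless U+0131, U+0130 -> 'i'."""
    return label.replace("I", "\u0131").replace("\u0130", "i").lower()

legacy = dns_ascii_fold("ISTANBUL")   # stays pure ASCII
turkic = turkic_lower("ISTANBUL")     # first letter becomes U+0131

print(legacy == turkic)               # False: the two mappings diverge
print(turkic.isascii())               # False: needs an ACE (Punycode) form
print(turkic.encode("idna"))          # ACE label, begins with b'xn--'
```

Under these assumptions, a client that applied the Turkic rule to an existing ASCII name typed in uppercase would query a different, Punycode-encoded label than the one that was registered.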

    So even localized browsers will need to forbid mapping dotless 'ı' as if it
    were 'I', and IDNA names containing 'ı' cannot be fully converted to
    uppercase, even with full case mappings, which will need to keep the
    lowercase letter. This will also be true for the '.tr' and '.az'
    registries, unless these registries adopt a policy requiring the
    reservation of domain names in bundles. If this occurs, it will be the
    registry which maps domain names containing 'i' (equivalent to 'I')
    identically to domain names containing a dotless lowercase 'ı'. For the
    dotted uppercase 'İ', separate allocation is still possible, but such names
    would be too easily spoofable, as that letter can be entered easily on
    Turkic keyboards to spoof the soft-dotted lowercase 'i'.

    So I doubt that the .tr and .az registries will ever adopt a distinction
    between dotted and undotted i in domain names; rather, they will ensure
    that no such distinction arises by adding bundle reservation policies if
    they ever implement IDNA. I also doubt that the Turkish and Azeri
    registries will resolve every name in a bundle containing dotless 'ı' or
    dotted 'İ', as that would require server-side dynamic DNS capabilities,
    which would also mean scalability problems. The .fr registry has already
    rejected the idea of resolving all names reserved in bundles, because some
    bundles may have thousands of equivalents and would be difficult to support
    in fast static DNS servers. Only one "canonical" name in the bundle will be
    resolved on DNS servers, the other names being left reserved until a
    standard solution is found to allow such resolution in clients of these
    registries, using the bundle equivalence rules defined by the specific IDNA
    bundle profile of each registry.
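
The scalability concern can be sketched in a few lines of Python. This is only an illustration under an assumed equivalence rule (dotted 'i' and dotless U+0131 treated as interchangeable in an already lowercased label); the functions `bundle` and `canonical` are hypothetical and do not reflect any registry's actual bundle profile.

```python
from itertools import product

# Assumed equivalence class, for illustration only: dotted i and dotless i.
I_VARIANTS = ("i", "\u0131")

def bundle(name: str) -> set:
    """Return every spelling of a lowercased label under the assumed rule."""
    choices = [I_VARIANTS if ch in I_VARIANTS else (ch,) for ch in name]
    return {"".join(combo) for combo in product(*choices)}

def canonical(name: str) -> str:
    """Pick one member to publish in DNS; here, the all-dotted ASCII form."""
    return name.replace("\u0131", "i")

names = bundle("bilgi")               # two i positions, so 2**2 variants
print(len(names))                     # 4
print(canonical("b\u0131lgi"))        # bilgi
print(2 ** "iiiiiiiiii".count("i"))   # 1024 variants for ten i positions
```

The bundle doubles with every additional 'i' in the label, which is why publishing only the canonical member in a static zone, and merely reserving the others, is the cheaper policy.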





