Security Issues

From: Mark Davis
Date: Wed Mar 23 2005 - 12:08:25 CST

    Various groups are actively pursuing the security issues involved in
    internationalized domain names. The actual scope of the work is larger,
    since it may affect core standards used for other types of identifiers, such
    as networked filenames. One of the possible approaches to the security
    issues is to limit the allowable characters in some way. The current draft
    of TR36 includes a chart that shows a breakdown of
    currently-allowed characters: . (Remember that there
    is a current restriction to Unicode 3.2 characters.)

    Some issues that I'd like broader feedback on:

    A. Currently, compatibility decomposables are mapped to their NFKC form. So
    if you type in a halfwidth katakana form, it will map to the fullwidth
    form. There is a proposal to simply forbid compatibility decomposables
    instead of mapping them. Is this acceptable (e.g., in Japan)?
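
    To make the two options in (A) concrete, here is a minimal Python sketch
    using the standard unicodedata module. The function names nfkc_map and
    has_compat_decomposable are hypothetical, chosen only for illustration, and
    the sketch uses Python's bundled Unicode data rather than Unicode 3.2.

```python
import unicodedata

def nfkc_map(label: str) -> str:
    """Current behaviour: map the label to its NFKC form."""
    return unicodedata.normalize("NFKC", label)

def has_compat_decomposable(label: str) -> bool:
    """Proposed behaviour: detect (so as to forbid) compatibility decomposables.

    unicodedata.decomposition() returns a string that starts with a tag such
    as "<narrow>" or "<wide>" when the mapping is a compatibility mapping, and
    no tag when it is canonical (or an empty string when there is none).
    """
    return any(unicodedata.decomposition(ch).startswith("<") for ch in label)

halfwidth_ka = "\uFF76"   # HALFWIDTH KATAKANA LETTER KA
print(nfkc_map(halfwidth_ka) == "\u30AB")     # maps to KATAKANA LETTER KA
print(has_compat_decomposable(halfwidth_ka))  # would be rejected under the proposal
```

    Python also exposes unicodedata.ucd_3_2_0, an object with the same methods
    backed by Unicode 3.2 data, which may be a closer match to the current
    restriction.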

    B. There is a proposal to restrict the characters to "LDH" characters
    (letters, digits, and hyphen). The closest thing we have in Unicode to that
    is the XID_Continue property, so the above chart separates characters out on
    that basis. The question is, are there any characters classed there under
    "Non-ID" that really should be allowed? (Example: U+05F4 ( ״ ) GERSHAYIM?)
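
    For anyone who wants to test candidate characters against the split in (B),
    the following is a rough Python sketch. It assumes that
    str.isidentifier() follows XID_Start/XID_Continue, as Python's identifier
    syntax is defined to do, and it uses Python's bundled Unicode version, not
    Unicode 3.2. The names is_xid_continue and non_id_chars are hypothetical.

```python
def is_xid_continue(ch: str) -> bool:
    """Approximate the XID_Continue test for a single character.

    Prefixing a known XID_Start letter ("a") means the identifier check
    succeeds only if ch itself is acceptable as a continuation character.
    """
    return len(ch) == 1 and ("a" + ch).isidentifier()

def non_id_chars(label: str) -> list:
    """Return the characters of a label that would fall under "Non-ID"."""
    return [ch for ch in label if not is_xid_continue(ch)]

print(is_xid_continue("\u05D0"))  # HEBREW LETTER ALEF
print(is_xid_continue("\u05F4"))  # HEBREW PUNCTUATION GERSHAYIM
print(non_id_chars("ab-c"))       # the hyphen is not XID_Continue
```

    Note that the hyphen in "LDH" is itself outside XID_Continue, so a real
    filter would need to allow it separately.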

    B1. Should all of the characters permitted in words in qualify?

    C. Characters with no uppercase in bicameral scripts may be suspect, and
    could be disallowed or flagged. Which of these really need to be allowed?
    (Example:
    U+04C0 ( Ӏ ) PALOCHKA?)
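
    One rough way to generate a candidate list for (C) is sketched below in
    Python. It assumes "no uppercase" can be approximated as "general category
    Ll, and uppercasing leaves the character unchanged"; that reading, and the
    name lowercase_without_uppercase, are assumptions made for illustration.
    Results depend on the Unicode version bundled with Python, so individual
    characters may be classified differently than under Unicode 3.2.

```python
import unicodedata

def lowercase_without_uppercase(ch: str) -> bool:
    """True if ch is a lowercase letter (Ll) with no uppercase mapping."""
    return unicodedata.category(ch) == "Ll" and ch.upper() == ch

def flag_candidates(label: str) -> list:
    """Return the characters in a label that would be flagged under (C)."""
    return [ch for ch in label if lowercase_without_uppercase(ch)]

print(lowercase_without_uppercase("a"))       # has uppercase "A"
print(lowercase_without_uppercase("\u0138"))  # LATIN SMALL LETTER KRA
```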

    D. The main focus is on characters in modern use. Is there any data that
    would let us separate out non-modern-use characters, at least for flagging?

