L2/06-340

Network Working Group                                    J. Klensin, Ed.
Internet-Draft                                          October 16, 2006
Intended status: Informational
Expires: April 19, 2007

Proposed Issues and Changes for IDNA - An Overview
draft-klensin-idnabis-issues-00.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 19, 2007.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

A recent IAB report identified issues that have been raised with Internationalized Domain Names (IDNs), some of which require tuning of the existing protocols and the tables on which they depend. Based on intensive discussion by an informal design team, this document further explains some of the issues that have been encountered and provides explanatory material for some of the proposals that are being made.

Klensin Expires April 19, 2007 [Page 1]
Internet-Draft IDNAbis Issues October 2006

Table of Contents

   1.  Introduction
     1.1.  Context and Overview
     1.2.  Discussion Forum
     1.3.  Terminology
   2.  The IDNA Model
     2.1.  Registration of IDNs
       2.1.1.  Proposed label
       2.1.2.  Conversion to Unicode
       2.1.3.  Permitted Character Identification
       2.1.4.  Stringprep Mappings
       2.1.5.  Post-Stringprep Character String Checking and Processing
       2.1.6.  Registry Restrictions
       2.1.7.  Punycode Conversion
       2.1.8.  Insertion in the Zone
     2.2.  Domain Name Resolution (Lookup)
       2.2.1.  User input
       2.2.2.  Conversion to Unicode
       2.2.3.  Pre-Nameprep Validation and Character List Testing
       2.2.4.  Stringprep Processing
       2.2.5.  Post-Nameprep Processing
       2.2.6.  Punycode Conversion
       2.2.7.  Name Resolution
   3.  IDNA200x Document List
   4.  Permitted Characters: An inclusion list
   5.  The Question of Prefix Changes
     5.1.  Conditions requiring a prefix change
     5.2.  Conditions not requiring a prefix change
   6.  Stringprep Changes and Compatibility
   7.  Display and Network order
   8.  The Ligature and Digraph Problem
   9.  Right-to-left text
   10. Acknowledgements
   11. Contributors
   12. IANA Considerations
   13. Security Considerations
   14. References
     14.1.  Normative References
     14.2.  Informative References
   Author's Address
   Intellectual Property and Copyright Statements

1. Introduction

1.1. Context and Overview

A recent IAB report identified issues that have been raised with Internationalized Domain Names (IDNs) and the associated standards. Those standards are known as Internationalized Domain Names in Applications (IDNA), taken from the name of the highest level standard within that group (see Section 1.3). Based on discussion of those issues and their impact, some of these standards now require tuning the existing protocols and the tables on which they depend. This document further explains, based on the results of some intensive discussions by an informal design team, some of the issues that have been encountered. It also provides explanatory material for some of the proposals that are being made. Explanatory material for other proposals will appear with the associated documents.

This document begins with a discussion of the IDNA model and the general differences in strategy between the original version of IDNA and the proposed new version, then continues with a description of specific changes that are needed.

[[anchor3: This initial draft is very preliminary and contains significant omissions. Some, but not all, of these are identified by explicit placeholders similar to this one.]]

1.2. Discussion Forum

This work is being discussed on the mailing list idn-update@alvestrand.no

1.3.
Terminology

This document uses the term "IDNA2003" to refer to the set of standards that make up and support the version of IDNA published in 2003, i.e., [RFC3490], [RFC3491], [RFC3492], and [RFC3454]. The term "IDNA200x" is used to refer to a possible new version of IDNA without specifying which particular documents would be impacted. While more common IETF usage might refer to the successor document(s) as "IDNAbis", this document uses that term, and similar ones, to refer to successors to the individual documents, e.g., "IDNAbis" is a synonym for the specific successor to RFC3490, or "RFC3490bis". See also Section 3.

Protocols in the IDNA group such as RFC 3454, RFC 3491 and RFC 3492 are referred to by their popular names of "Stringprep", "Nameprep", and "Punycode", respectively.

The term "Unicode" in this document refers to Unicode 3.2 [Unicode32] when it is used in the context of IDNA2003 and to Unicode 5.0 [Unicode50] in the context of IDNA200x. For the purposes of this document -- i.e., general explanation and issues that do not address specific code points or blocks -- Unicode 3.2, Unicode 4.0 [Unicode40], and Unicode 5.0 are essentially equivalent.

2. The IDNA Model

IDNA is a client-side protocol, i.e., almost all of the processing is performed by the client. The strings that appear in, and are resolved by, the DNS consist entirely of ASCII characters, conforming to the traditional rules for the naming of hosts, and consisting of only ASCII letters, digits, and hyphens. This approach permits IDNA to be deployed without modifications to the DNS itself which, in turn, avoids having to upgrade the entire Internet at once to support IDNs and the unknown risks of DNS changes to deployed systems.

IDNA has the following logical flow in domain name registration and resolution.
The IDNA2003 specification explicitly includes the equivalents of the steps in Section 2.1.3, Section 2.1.4, Section 2.1.5, and Section 2.1.7. The omission of an explicit discussion of the other steps has been one source of confusion. Another source has been the definition of IDNA2003 as an explicit algorithm, expressed partially in prose and partially in pseudocode. The steps below conform to more traditional IETF practice: the functions are specified, rather than the algorithm. The breakdown into steps is for clarity of explanation; any implementation that produces the same result with the same inputs is conforming.

2.1. Registration of IDNs

2.1.1. Proposed label

The registrant submits a request for an IDN, representing it in the local character coding used by the operating system. This string is typically produced by keyboard entry and converted to the local character set by the keyboard driver software.

[[anchor7: JcK: are we sure 'keyboard driver' is going to make sense to the audience. Certainly it is ok for the IETF part.]]

2.1.2. Conversion to Unicode

Some system routine, or a localized front-end to the IDNA process, converts the proposed label to a Unicode string. This conversion is obviously trivial in a Unicode-native system but may involve some complexity in one that is not, especially if the characters of the local character set do not map exactly and unambiguously onto Unicode characters. Depending on the system involved, the major difficulty may not lie in the mapping but in accurately identifying the incoming character set and then applying the correct conversion routine.

2.1.3. Permitted Character Identification

The Unicode string is examined to prohibit characters that IDNA does not permit in input. IDNA200x uses an inclusion-based approach, i.e., a list of characters that are permitted, rather than the exclusion-based approach of IDNA2003 (see Section 4).
Under IDNA2003, the list of excluded characters is quite limited because the model was to permit almost all Unicode characters to be used as input with many of them mapped into others. There is now general consensus that this exclusion-based model was a mistake and should be replaced, in IDNA200x, by a system that lists only those characters that are permitted and does much less mapping. Under the proposed IDNA200x, the string in Unicode form will be rejected if it contains characters that are not on the list of characters acceptable as IDNA input.

[[anchor8: Examples of impacted characters needed.]]

2.1.4. Stringprep Mappings

In the model of IDNA200x, Nameprep and Stringprep will be respecified to depend on Unicode properties, rather than on explicit character lists that are dependent on Unicode version. This change in definition does not change the functional model of IDNA processing (or of Stringprep-based processing more generally), but conceptually turns it into the clear set of steps described here and localizes dependencies on Unicode definitions and properties.

2.1.4.1. Normalization

The filtered string is then normalized (a Unicode concept, see any version of the Unicode Standard) to make string comparison possible even though some strings can be represented in several different ways in Unicode. In IDNA2003, the normalization method specified in Stringprep and invoked by Nameprep is based on Unicode method NFKC [Unicode-USX15]. The FC_NFKC_Closure property [FC-NFKC] is applied to facilitate subsequent case-folding.
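As an informal illustration of why normalization is needed for comparison, the following Python sketch applies NFKC to strings that have more than one Unicode representation. It relies on the standard library module "unicodedata", which follows whatever Unicode version ships with the interpreter rather than Unicode 3.2 (the module also exposes a "ucd_3_2_0" object for the older tables). The helper name "nfkc" is an assumption made for this example; it is not part of Stringprep or Nameprep.

```python
import unicodedata


def nfkc(label: str) -> str:
    """Return the NFKC-normalized form of a label (hypothetical helper)."""
    return unicodedata.normalize("NFKC", label)


# Two representations of the same visible string:
# precomposed U+00E9 versus "e" followed by U+0301 COMBINING ACUTE ACCENT.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# The raw code point sequences differ, so a naive comparison fails...
assert composed != decomposed
# ...but the normalized forms compare equal.
assert nfkc(composed) == nfkc(decomposed)

# NFKC also applies compatibility mappings, the kind of mapping that
# IDNA200x expects to see far less of because such characters are
# mostly not permitted as input.
assert nfkc("\ufb01") == "fi"   # U+FB01 LATIN SMALL LIGATURE FI
assert nfkc("\uff21") == "A"    # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A
```

The sketch covers only the normalization step; it does not apply case-folding or the FC_NFKC_Closure property.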
For IDNA200x, the new Stable NFKC method is used as a base to facilitate migration to future versions of Unicode but, because many of the characters permitted and then mapped to others in IDNA2003 are not permitted by IDNA200x (since most characters that would be mapped to others by compatibility equivalences are prohibited), the normalization operation is less extensive.

2.1.4.2. Case-folding

The normalized string is then case-mapped for scripts that make case distinctions similar to those of Greek to permit approximating the ASCII-case matching applied on name resolution in the DNS. Strictly speaking, case-folding starts with the normalization process above, then strings are case-folded, then they are normalized again. The application of the "FC_NFKC_Closure" property above simplifies this process in practice.

[[anchor11: Examples of impacted characters needed.]]

2.1.5. Post-Stringprep Character String Checking and Processing

All characters output from the step above are then verified for permissibility in IDNA, i.e., presence in the table of included characters (see Section 4). Additional transformations that do not occur as the result of the steps above may be specified at this point by IDNA200x.

[[anchor12: Examples of impacted characters needed.]]

2.1.6. Registry Restrictions

Registries at all levels of the DNS, not just the top level, are expected to establish policies about the labels that may be registered, and for the processes associated with that action. Such restrictions have always existed in the DNS and have always been applied at registration time, with the most notable example being enforcement of the hostname (LDH) convention itself.
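The hostname (LDH) convention mentioned above can be sketched as a simple label check. The Python fragment below is illustrative only: the function name "is_ldh_label" is hypothetical, and the rule it encodes (ASCII letters, digits, and hyphens, 1 to 63 characters, no leading or trailing hyphen) is the conventional reading of the hostname rules rather than text taken from this document.

```python
import re

# Conventional LDH rule, assumed for this example: ASCII letters, digits,
# and hyphens only; 1 to 63 characters; no hyphen at either end.
_LDH_PATTERN = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_label(label: str) -> bool:
    """Return True if the label satisfies the LDH convention (hypothetical helper)."""
    return _LDH_PATTERN.fullmatch(label) is not None


# An ordinary ASCII label and a Punycode-encoded ("xn--...") label both pass,
# which is why IDNA can be deployed without changes to the DNS itself.
assert is_ldh_label("example")
assert is_ldh_label("xn--bcher-kva")

# A label containing a non-ASCII character fails until it has been encoded.
assert not is_ldh_label("b\u00fccher")
assert not is_ldh_label("-leading-hyphen")
```

A registry would apply checks of this kind in addition to, not instead of, any IDN-specific policies.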
For IDNs, the restrictions to be applied are not an IETF matter except insofar as they derive from restrictions imposed by application protocols (e.g., email has always required a more restricted syntax for domain names than the restrictions of the DNS itself). Because these are restrictions on what can be registered, it is not generally necessary that they be global. If a name is not found on resolution, it is not relevant whether it could have been registered; only that it was not registered. Registry restrictions might include prohibition of mixed-script labels, or restrictions on labels permitted in a zone if certain other labels are already present (see [RFC3743] and [RFC4290] for discussion of some of the methods that have been applied by some registries). The various sets of ICANN IDN Guidelines [ICANN-Guidelines] also suggest restrictions that might sensibly be imposed.

The string produced by the above steps is checked and processed as appropriate to local registry restrictions. This may result in the rejection of some labels or the application of special restrictions to others.

[[anchor13: Examples of impacted characters needed.]]

2.1.7. Punycode Conversion

The domain name label resulting from the processes above is converted to its Punycode encoding (i.e., the "xn--..." form). Punycode is not changed in IDNA200x.

2.1.8. Insertion in the Zone

The Punycode-encoded string is then registered in the DNS by insertion into a zone.

2.2. Domain Name Resolution (Lookup)

2.2.1. User input

The user supplies a string in the local character set, typically by typing it or clicking on a URI or IRI.

2.2.2. Conversion to Unicode

The local character set, character coding conventions, and, as necessary, display and presentation conventions, are converted to Unicode, paralleling the process above.

2.2.3.
Pre-Nameprep Validation and Character List Testing

Again in parallel to the above, the Unicode string is checked to verify that all characters that appear in it are valid for IDNA input. As discussed in Section 4, this check should probably be more liberal than that of Section 2.1.4: characters that fall into "pending", "possibly later", or "unassigned codepoint" categories in the inclusion tables should probably not lead to label rejection at this point. Instead, the resolver should (MUST?) rely on the presence or absence of labels containing such characters in the DNS to determine their validity.

2.2.4. Stringprep Processing

As above, the validated Unicode string is normalized (using Stable NFKC) and case-mapped. IDNA2003 uses explicit codepoint tables in Stringprep to accomplish both of these operations.

2.2.5. Post-Nameprep Processing

Any necessary processing is applied to the normalized and case-mapped output string from the above.

2.2.6. Punycode Conversion

The validated string is converted to Punycode.

2.2.7. Name Resolution

The Punycode-encoded form of the label is looked up in the DNS, using normal DNS procedures.

3. IDNA200x Document List

[[anchor15: This section will need to be extensively revised or removed before publication.]]

The following documents are expected to be produced as part of the IDNA200x effort.

o  This document, containing an overview and rationale.

o  A document describing the "BIDI problem" with Stringprep and proposing a solution [IDNA200X-BIDI].

o  A list of initially permitted code points, based on Unicode 5.0 code blocks. See Section 4.

o  [[anchor16: ...More ??? ...]]

4. Permitted Characters: An inclusion list

Moving to an inclusion model requires a new list of characters that are permitted in IDNs. An initial version of such a list has been developed by the contributors to this document [IDNA200X-Blocks].
This was accomplished by going through Unicode 5.0 one block and one character class at a time and determining which characters, classes, and blocks were clearly acceptable for IDNs, which ones were clearly unacceptable (e.g., all blocks consisting entirely of compatibility characters and non-language symbols were excluded, as were a number of character classes), and which blocks and classes were in need of further study or input from the relevant language communities.

The discussion in [IDNA200X-BIDI] illustrates areas in which more work and input are needed. It is expected that such problems will be resolved quickly and the questioned scripts added to the list of permitted characters.

A procedure for adding additional characters to the inclusion list, either from blocks that are associated with notes in [IDNA200X-Blocks] or from future versions of Unicode, will be developed as part of this work. A key part of that procedure will be specifications that, in fact, make it possible to add new characters and blocks without long delays in implementation. For example, it may be desirable to more strongly distinguish between use of the protocols for "registration" (i.e., entering names in the DNS) and "lookup" (queries to the DNS), with most character inclusion rules applied at registration time only and clients generating queries relying on the lookup process to return "not found" errors if characters were invalid.

[[anchor17: That procedure is an important issue and this is a placeholder.]]

5. The Question of Prefix Changes

The conditions that would require a change in the IDNA "prefix" ("xn--" for the version of IDNA specified in [RFC3490]) have been a great concern to the community. A prefix change would clearly be necessary if the algorithms were modified in a manner that would create serious ambiguities during subsequent transition in registrations.
This section summarizes our conclusions about the conditions under which changes in prefix would be necessary.

5.1. Conditions requiring a prefix change

An IDN prefix change is needed if a given string would resolve or otherwise be interpreted differently depending on the version of the protocol or tables being used. Consequently, work to update IDNs would require a prefix change if, and only if, one of the following four conditions were met:

1.  The conversion of a Punycode string to Unicode yields one string under IDNA2003 (RFC3490) and a different string under IDNA200x.

2.  An input string that is valid under IDNA2003 and also valid under IDNA200x yields two different Punycode strings with the different versions. This condition is believed to be essentially equivalent to the one above.

    Note, however, that if the input string is valid under one version and not valid under the other, this condition does not apply. See the first item in Section 5.2, below.

3.  A fundamental change is made to the semantics of the string that is inserted in the DNS, e.g., if a decision were made to try to include language or specific script information in that string, rather than having it be just a string of characters.

4.  Sufficient characters are added to Unicode that the Punycode mechanism for offsets to blocks does not have enough capacity to reference the higher-numbered planes and blocks. This condition is unlikely even in the long term and certain to not arise in the next few years.

5.2. Conditions not requiring a prefix change

In particular, as a result of the principles described above, none of the following changes require a new prefix:

1.  Prohibition of some characters as input to IDNA. This may make names that are now registered inaccessible, but does not require a prefix change.

2.  Adjustments in Stringprep tables or IDNA actions, including normalization definitions, that do not impact characters that have already been invalid under IDNA2003.

3.  Changes in the style of definitions of Stringprep or Nameprep that do not alter the actions performed by them.

6. Stringprep Changes and Compatibility

Concerns have been expressed that, in attempting to improve the handling of IDNs, changes will be made to Stringprep that will cause problems for other uses of that specification, notably protocols used for identification or authentication. The section above (Section 5) essentially applies in this context as well: the proposed new inclusion tables [IDNA200X-Blocks], the reduction in the number of characters permitted as input to Stringprep (Section 4), and even the proposed changes in handling of right-to-left strings [IDNA200X-BIDI] either give interpretations to strings prohibited under IDNA2003 or prohibit strings that IDNA2003 permitted. Strings that are valid under both IDNA2003 and IDNA200x, and the corresponding versions of Stringprep, are not changed in interpretation.

Perhaps even more important in practice, since the other known uses of Stringprep encode or process characters that are already in normalized form and expect the use of only those characters that can be used in writing words of languages, the changes proposed here and in [IDNA200X-Blocks] are unlikely to have any impact at all.

7. Display and Network order

For correct treatment of domain names one must distinguish between Network Order (the order in which the codepoints are sent in protocols) and Display Order (the order in which the codepoints are displayed on a screen or paper). The order of one label in a domain name is discussed in [IDNA200X-BIDI].
But there are also questions about the order in which labels are to be displayed if left-to-right and right-to-left labels are adjacent to each other, especially after more than one appearance of one of the types. That decision is ultimately under the control of user agents -- including web browsers, mail clients, and the like -- which may be highly localized. Even when formats are specified by protocols, the full composition of an Internationalized Resource Identifier (IRI) [RFC3987] or Internationalized Email address contains elements other than the domain name. For example, IRIs contain protocol identifiers and field delimiter syntax such as "http://" or "mailto:" while email addresses contain the "@" to separate local parts from domain names. User agents are not required to use those protocol-based forms directly but often do so. Do the protocol constraints imply that the overall direction of these strings will always be left-to-right (or right-to-left) for an IRI or email address? Should they?

These questions could have several possible answers. If one has a domain name abc.def in which both labels are represented in scripts that are written right-to-left, should it be displayed as fed.cba or cba.fed? One can notice that, in network order, an IRI for clear-text web access would begin with "http://" and the characters will appear as "http://abc.def". But what does this suggest about the display order? When entering a URI into many browsers, one may enter only the domain name (leaving the "http://" to be filled in by default and assuming no tail -- an approach that does not work for other protocols). The natural display order for the typed domain name on a right-to-left system is fed.cba. Does this change if a protocol identifier, tail, and the corresponding delimiters are specified?
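The two candidate display orders in the abc.def example can be derived mechanically from the network-order string. The Python sketch below is only a toy: it uses the ASCII letters from the text as stand-ins for right-to-left characters, the function name "display_candidates" is hypothetical, and it does not implement the Unicode Bidirectional Algorithm that real user agents apply.

```python
def display_candidates(network_name: str) -> dict:
    """Return two plausible display orders for a name whose labels are
    all written right-to-left (hypothetical helper; toy model only)."""
    labels = network_name.split(".")
    return {
        # Each label is rendered right-to-left, but the labels keep their
        # network (left-to-right) sequence.
        "labels_in_network_sequence": ".".join(label[::-1] for label in labels),
        # The whole name, dots included, is rendered right-to-left.
        "whole_name_right_to_left": network_name[::-1],
    }


candidates = display_candidates("abc.def")
assert candidates["labels_in_network_sequence"] == "cba.fed"
assert candidates["whole_name_right_to_left"] == "fed.cba"
```

Both outputs correspond to the same network-order name, which is the ambiguity a reader faces when different user agents choose differently.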
While logic, precedent, and reality suggest that these are questions for user interface design, not IETF protocol specifications, experience in the 1980s and 1990s of mixing systems in which domain name labels were read in network order (left-to-right) and those in which those labels were read right-to-left would predict a great deal of confusion, and heuristics that sometimes fail, if each implementation of each application makes its own decisions on these issues.

It should be obvious that any revision of IDNA must be clearer than the original specification was about the distinction between network and display order, for complete (fully-qualified) domain names as well as for individual labels. It is likely that some strong suggestions should be made about display order as well.

[[anchor21: Some specific examples probably needed, although they will need to be spelled out to permit rendering in ASCII.]]

8. The Ligature and Digraph Problem

There are a number of languages written with alphabetic scripts in which single phonemes are written using two characters, termed a "digraph", for example, the "ph" in "pharmacy" and "telephone". (Note that characters paired in this manner can also appear consecutively without forming a digraph, as in "tophat".) Certain digraphs are normally indicated typographically by setting the two characters closer together than they would be if used consecutively to represent different phonemes. Some digraphs are fully joined as ligatures (strictly designating setting totally without intervening white space, although the term is sometimes applied to close set pairs). An example of this may be seen when the word "encyclopaedia" is set with U+00E6 LATIN SMALL LETTER AE (the "ae" ligature).
Difficulties arise from the fact that a given ligature may be a completely optional typographic convenience for representing a digraph in one language (as in the preceding example), while in another language it is a single character that may not always be correctly representable by a two-letter sequence. This can be illustrated by many words in the Norwegian language, where the "ae" ligature is the 27th letter of a 29-letter extended Latin alphabet. It is equivalent to the 28th letter of the Swedish alphabet (also containing 29 letters), U+00E4 LATIN SMALL LETTER A WITH DIAERESIS, for which an "ae" cannot be substituted according to current orthographic standards.

This character (U+00E4) is also part of the German alphabet where, unlike in the Nordic languages, the two-character sequence "ae" is a fully acceptable alternate orthography. The inverse is however not true, and those two characters cannot necessarily be combined into an "umlauted a". This also applies to another German character, the "umlauted o" (U+00F6 LATIN SMALL LETTER O WITH DIAERESIS) which, for example, cannot be used for writing the name of the author "Goethe". It is also a letter in the Swedish alphabet where, in parallel to the "umlauted a", it cannot be correctly represented as "oe".

Additional situations with alphabets written right-to-left are described in [IDNA200X-BIDI].

This constitutes a problem that cannot be resolved solely by operating on scripts. It is, however, a key concern in the IDN context. Its satisfactory resolution will require support in policies set by registries, which therefore need to be particularly mindful not just of this specific issue, but of all other related matters that cannot be dealt with on an exclusively algorithmic basis.
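That these equivalences are language-dependent, and so not something normalization provides, can be checked directly. The Python sketch below uses the standard "unicodedata" module with the interpreter's bundled Unicode tables (not necessarily Unicode 3.2 or 5.0); the helper name "same_after_nfkc" is an assumption made for this illustration.

```python
import unicodedata


def same_after_nfkc(a: str, b: str) -> bool:
    """Return True if two strings are identical after NFKC normalization
    (hypothetical helper for this illustration)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


# U+00E6 is a letter in its own right; normalization does not expand it.
assert not same_after_nfkc("encyclop\u00e6dia", "encyclopaedia")

# U+00E4 and U+00F6 are likewise never treated as "ae" or "oe", even though
# German orthography accepts "ae" as an alternate spelling of U+00E4.
assert not same_after_nfkc("\u00e4", "ae")
assert not same_after_nfkc("G\u00f6the", "Goethe")

# What normalization does unify is purely representational:
# "a" followed by U+0308 COMBINING DIAERESIS composes to U+00E4.
assert same_after_nfkc("a\u0308", "\u00e4")
```

Treating "ae" and U+00E4 as variants of one another therefore has to be a registry policy decision informed by the language in question.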
Just as with the examples of different-looking characters that may be assumed to be the same, as discussed in Section 2.2.6 of [RFC4690], it is in general impossible to deal with these situations in a system such as IDNA -- or Unicode normalization generally -- since determining what to do requires information about the language being used, context, or both.

Consequently, IDNAbis makes no attempt to treat these combined characters in any special way. However, this is a prime example of a situation where a registry that is aware of the language context in which labels are to be registered, and where that language sometimes (or always) treats the two-character sequences as equivalent to the combined form, should give serious consideration to applying a "variant" model [RFC3743] [RFC4290] to reduce the opportunities for user confusion and fraud that would result from the related strings being registered to different parties.

9. Right-to-left text

In order to be sure that the directionality of text is unambiguous, Stringprep requires that any label in which right-to-left characters appear both starts and ends with characters that are unambiguously directional, and rejects any other string that contains a right-to-left character. This is one of the few places where the IDNA algorithms essentially look at an entire label, not just at individual characters.

Unfortunately, the algorithmic model, as defined in Stringprep, fails when the final character in a right-to-left string is "decorated", i.e., requires a combining character to be correctly represented. The combining character is not identified with the right-to-left character attribute, so Stringprep rejects the string.

This problem manifests itself in languages written with consonantal alphabets in which vowels are indicated as combining marks, and where they are an essential component of the orthography.
Examples of this are Yiddish, written with an extended Hebrew script, and Dhivehi (the official language of Maldives) which is written in the Thaana script (which is, in turn, derived from the Arabic script). Other languages are still being investigated, but Stringprep definitely needs to be adjusted.

10. Acknowledgements

The editor and contributors would like to express their thanks to those who contributed significant early review comments, sometimes accompanied by text, especially Mark Davis, Paul Hoffman, Simon Josefsson, and Sam Weiler.

... More to be supplied...

11. Contributors

While the listed editor held the pen, this document represents the joint work and conclusions of an ad hoc design team consisting of the editor and, in alphabetic order, Harald Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp. In addition, there were many specific contributions and helpful comments from those listed in the Acknowledgements section and others who have contributed to the development and use of the IDNA protocols.

12. IANA Considerations

While this document does not contain specific actions for IANA, it anticipates the creation of a registry of Unicode blocks and characters permitted in IDNs and a mechanism for expanding that registry. See Section 4.

13. Security Considerations

Any change to Stringprep or, more broadly, the IETF's model of the use of internationalized character strings in different protocols, creates some risk of inadvertent changes to those protocols, invalidating deployed applications or databases, and so on. Our current hypothesis is that the same considerations that would require changing the IDN prefix (see Section 5.2) are the ones that would, e.g., invalidate certificates or hashes that depend on Stringprep, but those cases require careful consideration and evaluation.

...???more to be supplied...

14.
References Klensin Expires April 19, 2007 [Page 14] Internet-Draft IDNAbis Issues October 2006 14.1. Normative References [FC-NFKC] The Unicode Consortium, "Derived Property: FC_NFKC_Closure", June 2006, . [IDNA200X-BIDI] Alvestrand, H. and C. Karp, "An IDNA problem in right-to- left scripts", October 2006, . [IDNA200X-Blocks] Faltstrom, P., "??? Permitted Character List for IDNA (placeholder)", October 2006, . A version of this document, with color coding to make the categories more clear, and supplemental materials, are available at http://stupid.domain.name/idnabis/00.html [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of Internationalized Strings ("stringprep")", RFC 3454, December 2002. [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003. [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003. [RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003. [RFC3743] Konishi, K., Huang, K., Qian, H., and Y. Ko, "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean", RFC 3743, April 2004. [RFC4290] Klensin, J., "Suggested Practices for Registration of Internationalized Domain Names (IDN)", RFC 4290, December 2005. [Unicode-USX15] The Unicode Consortium, "Unicode Standard Annex #15: Unicode Normalization Forms", 2006, Klensin Expires April 19, 2007 [Page 15] Internet-Draft IDNAbis Issues October 2006 . [Unicode32] The Unicode Consortium, "The Unicode Standard, Version 3.0", 2000. (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5). 
Version 3.2 consists of the definition in that book as amended by the Unicode Standard Annex #27: Unicode 3.1 (http://www.unicode.org/reports/tr27/) and by the Unicode Standard Annex #28: Unicode 3.2 (http://www.unicode.org/reports/tr28/). [Unicode40] The Unicode Consortium, "The Unicode Standard, Version 4.0", 2003. [Unicode50] The Unicode Consortium, "The Unicode Standard, Version 5.0", 2006. Forthcoming fourth quarter 2006. Available online at http://www.unicode.org/versions/Unicode5.0.0/ 14.2. Informative References [ICANN-Guidelines] ICANN, "IDN Implementation Guidelines", 2006, . [RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource Identifiers (IRIs)", RFC 3987, January 2005. [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and Recommendations for Internationalized Domain Names (IDNs)", RFC 4690, September 2006. Klensin Expires April 19, 2007 [Page 16] Internet-Draft IDNAbis Issues October 2006 Author's Address John C Klensin (editor) 1770 Massachusetts Ave, Ste 322 Cambridge, MA 02140 USA Phone: +1 617 245 1457 Fax: Email: john+ietf@jck.com URI: Klensin Expires April 19, 2007 [Page 17] Internet-Draft IDNAbis Issues October 2006 Full Copyright Statement Copyright (C) The Internet Society (2006). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights. This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).