L2/11-311
Date/Time: Fri Jul 29 05:31:51 CDT 2011
Contact: duerst@it.aoyama.ac.jp
Name: Martin Dürst
Report Type: Public Review Issue
Opt Subject: PRI #185 Revision of UBA for improved display of URL/IRIs

Context
=======

I'm an individual Unicode member, but I'll paste this into the reporting form because that's easier. Please make a 'document' out of it (or more than one, if that helps to better address the issues raised here). I apologize for being late with my comments.

Substantive Comments
====================

On substance, although I don't agree with every detail of what Jonathan Rosenne, Behdad Esfahbod, Aharon Lanin and others have said, I agree with them in general. In case their documents/messages have not been properly submitted, I include them herewith by reference.

The proposal is an enormous change to the Bidi algorithm, changing its nature in huge ways. Whatever the details eventually may look like, it won't be possible to get everything right in one step, and countless tweaks will probably follow (not that they will necessarily make things better, though). Also, dealing with IRIs will increase the appetite/pressure for dealing with various other syntactical constructs in texts.

The introduction of the new algorithm will create numerous compatibility issues (and attack surfaces for phishing, the main thing the proposal tries to address) for a long period of time. Given that the Unicode Consortium has been working hard in TR46 to address IDN compatibility issues that are extremely minor compared to this one, it's difficult for me to see how this fits together.

Taking One Step Back
====================

As one of the first people involved with what are now called IDNs and IRIs, I know that the problem of such Bidi identifiers is extremely hard.
The IETF, as the standards organization responsible for (Internationalized) Domain Names and for URIs/IRIs, has taken some steps to address it: there is a Bidi section in RFC 3987 (http://tools.ietf.org/html/rfc3987#section-4), and for IDNs there is RFC 5893 (http://tools.ietf.org/html/rfc5893). I don't think these are necessarily sufficient. And I don't think that the proposal at hand is completely useless. However, the proposal touches many aspects (e.g. recognizing IRIs in plain text,...) that are far better suited to definition in another standards organization, or where high-bandwidth coordination with such an organization is crucial (roughly speaking, first on the feasibility of various approaches, then on how to split up the work between the relevant organizations, then on coordination of the details). Without such a step back and high-bandwidth coordination, there is a strong chance of producing something highly suboptimal.

(Side comment on detail: It would be better for the document to use something like http://tools.ietf.org/html/rfc3987#section-2.2 rather than the totally obscure and no longer maintained http://rfc-ref.org/RFC-TEXTS/3987/chapter2.html, in the same way the Unicode Consortium would probably prefer to have its own Web site referenced for its work rather than some third-party Web site.)

Taking Another Step Back
========================

I mentioned 'high-bandwidth' above. The Unicode "Public Review" process is definitely not suited for this. It has various problems:

- The announcements are often very short, formalistic, and cryptic (I can dig up examples if needed.)
- The announcements go to a single list; forwarding them to other relevant places is mostly a matter of chance. This should be improved by identifying the relevant parties and contacting them directly.
- The submission is via a Web form, without any confirmation that the comment has been received.
- The space for comments on the form is very small.
- There is no way to make a comment public (except for publishing it separately).
- There is no official response to a comment submitted to the Web form. One finds out about what happened by chance or not at all. (Compare this to the W3C process, where WGs are required to address each comment formally, and most comments, including the responses, are public.)
- The turnaround is slow. Decisions get made (or postponed) at UTC meetings only.

Overall, from an outsider's point of view, the review process and the review form feel like the eye of a needle connected to a black hole.

[I very much understand that part of the reason the UTC works the way it works is because of its collaboration with ISO/IEC committees. And I don't think any other standards organization has a perfect process. But what's appropriate for one part of the UTC's work may not be appropriate for other parts of its work (such as the matter at hand).]

Conclusion
==========

I herewith very strongly recommend that the UTC, besides using the upcoming meeting to advance discussion on the technical issues that the proposal raises,

a) Postpone the decision to adopt any of the proposed changes, independent of details, until such time as point b) is implemented and executed.

b) Swiftly take the necessary steps for a much better, high-bandwidth coordination of this topic and the various issues it encompasses, both using the existing liaison mechanism and using public discussion on an appropriate forum (e.g. one of the relevant IETF mailing lists (idna/eai/iri)).

c) Seriously work on improving the process for soliciting and addressing comments from the public and relevant stakeholders.

Regards, Martin.