Re: [bidi] Re: PRI 185 Revision of UBA for improved display of URL/IRIs

From: Martin J. Dürst <duerst_at_it.aoyama.ac.jp>
Date: Fri, 29 Jul 2011 19:32:18 +0900

Hello Mark, others,

On 2011/07/28 5:01, Mark Davis ☕ wrote:
> Just to remind people: posting to this list does *not* mean submitting to
> the UTC. If you want to discuss a proposal here, not a problem, but just
> remember that if you want any action you have to submit to the UTC.
>
> Unicode members via: http://www.unicode.org/members/docsubmit.html
> Others via: http://www.unicode.org/reporting.html

[I'll copy this text to the ima_at_ietf.org mailing list (the mailing list of
the EAI (Email Address Internationalization) WG) to have a public
record, because that's the mailing list where most of the discussion
about this draft in the IETF happened, as far as I'm aware.]

Context
=======

I'm an individual Unicode member, but I'll paste this into the
reporting form because that's easier. Please make a 'document' out of it
(or more than one, if that helps to better address the issues raised
here). I apologize for being late with my comments.

Substantive Comments
====================

On substance, while I don't agree with every detail of what Jonathan
Rosenne, Behdad Esfahbod, Aharon Lanin and others have said, I agree with
them in general. If their documents/messages are not properly submitted, I
include them herewith by reference.

The proposal is an enormous change in the Bidi algorithm, changing its
nature in huge ways. Whatever the details eventually may look like, it
won't be possible to get everything right in one step, and probably
countless tweaks will follow (not that they necessarily will make things
better, though). Also, dealing with IRIs will increase the
appetite/pressure for dealing with various other syntactical constructs
in texts.

The introduction of the new algorithm will create numerous compatibility
issues (and attack surfaces for phishing, the main thing the proposal
tries to address) for a long period of time. Given that the Unicode
Consortium has been working hard to address compatibility issues
regarding IDNs in TR46 that are extremely minor compared to this one,
it's difficult for me to see how this fits together.

Taking One Step Back
====================

As one of the first people involved with what's now called IDNs and
IRIs, I know that the problem of such Bidi identifiers is extremely
hard. The IETF, as the standards organization responsible for
(Internationalized) Domain Names and for URIs/IRIs, has taken some steps
to address it (there's a Bidi section in RFC 3987
(http://tools.ietf.org/html/rfc3987#section-4), and for IDNs, there is
http://tools.ietf.org/html/rfc5893).

I don't think these are necessarily sufficient or anything. And I don't
think that the proposal at hand is completely useless. However, the
proposal touches many aspects (e.g. recognizing IRIs in plain text,...)
that are vastly more adequate for definition in another standards
organization or where a high-bandwidth coordination with such an
organization is crucial (roughly speaking, first on feasibility of
various approaches, then on how to split up the work between the
relevant organizations, then on coordination in details.) Without such a
step back and high-bandwidth coordination, there is a strong chance of
producing something highly suboptimal.

(Side comment on detail: It would be better for the document to use
something like
http://tools.ietf.org/html/rfc3987#section-2.2 rather than the totally
obscure and no longer maintained
http://rfc-ref.org/RFC-TEXTS/3987/chapter2.html, in the same way the
Unicode Consortium would probably prefer to have its own Web site
referenced for its work rather than some third-party Web site.)

Taking Another Step Back
========================

I mentioned 'high-bandwidth' above. The Unicode "Public Review" process is
definitely not suited for this. It has various problems:
- The announcements are often very short, formalistic, and cryptic.
   (I can dig up examples if needed.)
- The announcements go to a single list; forwarding them to other
   relevant places is mostly a matter of chance. This should be improved
   by identifying the relevant parties and contacting them directly.
- To find the Web form, one has to traverse several links.
- The submission is via a Web form, without any confirmation that the
   comment has been received.
- The space for comments on the form is very small.
- There is no way to make a comment public (except for publishing it
   separately).
- There is no official response to a comment submitted to the Web form.
   One finds out about what happened by chance or not at all.
   (Compare this to the W3C process, where WGs are required to address
    each comment formally, and most comments, including the responses,
    are public.)
- The turnaround is slow. Decisions get made (or postponed) at UTC
   meetings only.
Overall, from an outsider's point of view, the review process and the
review form feel like the eye of a needle connected to a black hole.

[I very much understand that part of the reason the UTC works the way it
works is because of its collaboration with ISO/IEC committees. And I
don't think any other standards organization has a perfect process. But
what's appropriate for one part of the UTC's work may not be appropriate
for other parts of its work (such as the matter at hand).]

Conclusion
==========

I herewith very strongly recommend that the UTC, besides using the
upcoming meeting to advance discussion on the technical issues that the
proposal raises,
a) Postpone the decision to adopt any of the proposed changes,
independent of details, until such time as point b) is implemented and
executed.
b) Swiftly take the necessary steps for a much better, high-bandwidth
coordination of this topic and the various issues it encompasses, both
using the existing liaison mechanism and using public discussion on an
appropriate forum (e.g. one of the relevant IETF mailing lists
(idna/eai/iri)).
c) Seriously work on improving the process for soliciting and addressing
comments from the public and relevant stakeholders.

Regards, Martin.
Received on Fri Jul 29 2011 - 05:39:34 CDT
