From: Martin J. Dürst (email@example.com)
Date: Thu Mar 04 2010 - 02:23:22 CST
On 2010/03/04 2:12, Shawn Steele wrote:
>> An IRI is a sequence of Unicode characters. Is there not
>> already a well-defined way of converting a sequence of
>> Unicode characters to a visual display?
> The problem (from my perspective at least) is that the Unicode BIDI rules are somewhat "generic".
Yes indeed. It would be nice if we could add support for more and more
stuff with arbitrary complexity to the Unicode bidi algorithm, but I
don't see how that could be deployed.
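
To make "generic" concrete, here is a minimal sketch (Python, using only the standard `unicodedata` module) that prints the bidi class the Unicode Character Database assigns to a few common IRI delimiters. The choice of sample characters and the helper name `bidi_classes` are assumptions made for illustration, and the classes reported depend on the Unicode version bundled with the Python build.

```python
import unicodedata

# Delimiters commonly found in IRIs, plus one Latin and one Hebrew
# letter for contrast. The selection is illustrative, not exhaustive.
samples = [":", "/", ".", "?", "#", "a", "\u05D0"]

def bidi_classes(chars):
    """Map each character to its Unicode bidi class (hypothetical helper)."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}

classes = bidi_classes(samples)
for ch, cls in classes.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cls}")
```

The delimiters come out as weak or neutral types (for example, "/" and "." are common separators), so the algorithm resolves them from the surrounding letters and has no notion that they delimit IRI components.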
> Unicode expects things like / and . to be used in a context of same-script stuff, like a date, time or number.
> IRIs use them as delimiters for a list of elements (labels in the domain name or folders in the path), in a hierarchical form.
> The Unicode BIDI algorithm doesn't recognize that there's an underlying hierarchy, so it can end up "swapping" pieces in that hierarchy in some cases.
There's of course a lot of hierarchy, but what's more inherent and basic
is sequence. The URI spec defines the order of the various components;
the hierarchy is more in people's heads than anywhere else.
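
As a rough illustration of "sequence", the sketch below flattens an IRI into its components in logical (stored) order. It assumes Python's `urllib.parse.urlsplit`, which is a generic URI splitter and not an IRI-specific parser; the sample IRI and the helper name `logical_components` are made up for the example.

```python
from urllib.parse import urlsplit

def logical_components(iri):
    """Return the IRI's components in logical order (hypothetical helper)."""
    parts = urlsplit(iri)
    # Domain name labels, in the order they are stored.
    labels = parts.hostname.split(".") if parts.hostname else []
    # Path segments, skipping the empty string before the leading "/".
    segments = [s for s in parts.path.split("/") if s]
    return [parts.scheme, *labels, *segments, parts.query, parts.fragment]

# Sample IRI with two Hebrew path segments (illustrative only).
iri = "http://www.example.com/\u05d0\u05d1/\u05d2\u05d3?q=1#top"
components = logical_components(iri)
print(components)
```

Whatever a display does with such an IRI, this list is the order that a consistent logical <-> visual mapping would have to preserve or invert predictably.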
> I'm not sure UTR#36 is the proper place
I fully agree that UTR#36 is NOT the right place for putting what's
currently in section 4 of the IRI WG draft.
In some sense, this would be equivalent to the IRI spec only saying that
IRIs are composed of domain names, path components, query parts,..., and
then saying: Look over there for how to order them on a napkin or on the
side of the bus (or on a display).
UTR#36 already has a good section, 2.5 Bidirectional Text Spoofing,
which currently does exactly the right thing, namely say that bidi
display of IDNs and IRIs is, among other things, also a security issue.
[off-topic: 2.5.1 in UTR#36 doesn't belong in 2.5, but should be its own
subsection; there is only minor overlap in that Arabic is affected by
both bidi and complex shaping.]
> to clarify display of such ordered lists.
Ok, you got from hierarchy to ordered list, which I think is exactly
what I called 'sequence' above.
> Proper BIDI rendering of IRIs isn't just a security, but also a usability, problem.
Very much so. There are two levels here:
- Interoperability as usability: If there isn't a single, well-defined,
consistent logical <-> visual mapping for IRIs, they are not usable at all.
- Immediate human usability: It should be possible for humans to build
an easily understandable and actionable mental model (or use an existing
mental model that they already have) for bidi IRIs and their visual
> It does seem like perhaps this concept should be mentioned in Unicode somewhere. (IRIs aren't the only place that similar ordered lists happen).
-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:firstname.lastname@example.org
This archive was generated by hypermail 2.1.5 : Thu Mar 04 2010 - 02:27:23 CST