RE: BIDI IRI Display (was spoofing and IRIs)

From: Shawn Steele
Date: Wed Mar 03 2010 - 11:12:57 CST

    > An IRI is a sequence of Unicode characters. Is there not
    > already a well-defined way of converting a sequence of
    > Unicode characters to a visual display?

    The problem (from my perspective at least) is that the Unicode BIDI rules are somewhat "generic". Unicode expects characters like / and . to appear within same-script content, such as a date, time, or number. IRIs use them as delimiters between the elements of a hierarchical list (labels in the domain name, or folders in the path). The Unicode BIDI algorithm doesn't recognize that underlying hierarchy, so in some cases it can end up visually "swapping" pieces of it.
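
    A minimal sketch of that "swapping" effect, assuming a left-to-right paragraph containing only letters and separators (no digits, brackets, combining marks, or explicit directional controls). The function name visual_order_ltr is hypothetical, and the code is a deliberately reduced model of the neutral-resolution and reordering steps, not a full implementation of the Unicode BIDI algorithm. Hebrew letters are written as escapes so the source itself is not reordered by an editor.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}  # Hebrew-type and Arabic-type strong classes


def visual_order_ltr(text: str) -> str:
    """Approximate display order of `text` in a left-to-right paragraph.

    Simplified model (assumption): characters are either strong L,
    strong R/AL, or treated as neutral. Digits are NOT handled.
    """
    n = len(text)
    resolved = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG_RTL:
            resolved.append("R")
        elif cls == "L":
            resolved.append("L")
        else:
            resolved.append(None)  # separator or other neutral

    # A run of neutrals becomes R only when it sits between two R
    # characters; otherwise it takes the paragraph direction (L).
    i = 0
    while i < n:
        if resolved[i] is None:
            j = i
            while j < n and resolved[j] is None:
                j += 1
            before = resolved[i - 1] if i > 0 else "L"
            after = resolved[j] if j < n else "L"
            fill = "R" if before == after == "R" else "L"
            for k in range(i, j):
                resolved[k] = fill
            i = j
        else:
            i += 1

    # Reverse each maximal right-to-left run for display.
    out = []
    i = 0
    while i < n:
        j = i
        while j < n and resolved[j] == resolved[i]:
            j += 1
        run = text[i:j]
        out.append(run[::-1] if resolved[i] == "R" else run)
        i = j
    return "".join(out)


# Logical order: <alef bet> . <gimel dalet> . com
logical = "\u05d0\u05d1.\u05d2\u05d3.com"
visual = visual_order_ltr(logical)
# The "." between the two Hebrew labels is absorbed into one
# right-to-left run, so the two labels trade places on screen.
print([f"U+{ord(c):04X}" for c in visual])
```

    In this model the first label in the hierarchy (alef bet) ends up displayed to the right of the second one (gimel dalet), while "com" stays where it was, so the left-to-right reading order of the labels no longer matches their logical order.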

    I'm not sure UTR#36 is the proper place to clarify the display of such ordered lists. Proper BIDI rendering of IRIs isn't just a security problem; it's also a usability problem. It does seem like perhaps this concept should be mentioned somewhere in Unicode. (IRIs aren't the only place where similar ordered lists occur.)

