Re: Bug in Bidi

From: Mark Davis
Date: Tue Dec 26 2000 - 14:45:48 EST

What you have found is a valid bug, but not quite what you describe. Here
are the two principles at issue.

P2. In each paragraph, find the first character that is a strong
directional type (L, AL, R).

P3. If a character is found in P2 and it is of type AL or R, then set the
paragraph embedding level to one; otherwise, set it to zero.
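
As a rough illustration of P2 and P3 as quoted above, here is a minimal
Python sketch. The function name paragraph_level is made up for this
example; the only library call is the standard unicodedata.bidirectional.
It follows the two rules exactly as worded here and makes no attempt to
cover anything else in the algorithm.

```python
import unicodedata


def paragraph_level(text):
    """Return the paragraph embedding level per P2 and P3 as quoted above.

    Hypothetical helper for illustration only. Only L, AL and R count as
    strong types here, so RLE, RLO, LRE, LRO and PDF are skipped over.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0  # P3: first strong character is L
        if bidi_class in ("AL", "R"):
            return 1  # P3: first strong character is AL or R
    return 0  # P3: no strong character found


# RLO (U+202E) in front of Latin text does not change the level,
# while a leading RLM (U+200F), which has type R, does.
print(paragraph_level("\u202Eabc"))
print(paragraph_level("\u200F\u202Eabc"))
```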

Someone could interpret P2 as applying to all strong types, instead of just
(L, AL, R). The best way to fix it is to have Table 3-7, Bidirectional
Character Types, put RLE, RLO, LRE, LRO, and PDF in a separate category, not in
Strong (or Weak). This would not disturb the rest of the algorithm, since
these characters are removed before any further reference to "strong" or
"weak" types.

The reason that these characters are not to affect the paragraph embedding
level is that typically, if anything, the paragraph level should be the
*opposite* of the embedding; the embedding marks that the embedded text is
to be given a *different* direction than the surrounding text.
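
The proposed regrouping could be sketched roughly as follows. This is
only an illustration: the category names and the bidi_category helper are
hypothetical, and the membership of the weak and neutral sets is my
assumption about the table, not a quotation from it.

```python
import unicodedata

# Hypothetical grouping: the explicit codes get a category of their own
# instead of being listed under Strong (or Weak).
EXPLICIT = {"LRE", "LRO", "RLE", "RLO", "PDF"}
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}  # assumed membership
NEUTRAL = {"B", "S", "WS", "ON"}                    # assumed membership


def bidi_category(ch):
    """Map a character to one of the illustrative category names."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in EXPLICIT:
        return "explicit"
    if bidi_class in STRONG:
        return "strong"
    if bidi_class in WEAK:
        return "weak"
    if bidi_class in NEUTRAL:
        return "neutral"
    return "other"  # any class not covered by the sets above


print(bidi_category("\u202E"))  # RLO
print(bidi_category("a"))
```

With the explicit codes in their own set, a test for "strong" can never
match RLO or RLE by accident.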

Mark Davis, IBM GCoC, Cupertino
(408) 777-5850 [fax: 5891]

Roozbeh Pournader on 12-19-2000 02:11:58

To: Mark Davis/Cupertino/IBM@IBMUS
cc: Unicode List
Subject: Bug in Bidi

Dear Mark,

It seems that rules P1--P3 of the Unicode bidi algorithm (for determining
paragraph levels) don't make sense in some ways. I think that explicit
directional codes should also be counted here, so that RLO and RLE would
also be able to set the paragraph level to one. That is what a user
expects when she wraps an otherwise left-to-right paragraph in
something like RLO. I know about the solution with RLM, but I think using
both RLO and RLM when an RLO alone would do is wasteful. I feel that this
many control codes is excessive.

Would you please comment on this?


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:17 EDT