L2/01-089

From: Roozbeh Pournader [roozbeh@sharif.edu]
Sent: Friday, January 05, 2001 8:51 AM
Subject: A real bug in bidi

Dear Unicoders,

This time I think we have found a real bug in the Bidirectional Algorithm. The problem is that the algorithm seems to contradict itself. We were trying to use the "Implementation Notes" at the end of UTR#9 to retain the format codes, but that doesn't produce the same results as removing them in rule X9.

We would really appreciate any comments. Would you please take your pencils out? ;)

Our example is probably not the simplest case, but it is small enough:

  U+202B  U+05D1  U+202C  U+0031  U+202D  U+0061  U+202C
          BET             1               a

When we run the algorithm with the notes in "Retaining Format Codes", we get the following levels:

          BET             1               a
  1       3       3       2       1       2       1

which according to L2 becomes:

  a BET 1

when rendered visually. That's "a BET 1".

But when the format codes are removed in X9, the levels will be:

  BET     1       a
  3       2       2

which becomes "BET 1 a" when rendered. So the order is different, you see.

(I do not claim anything about the user expectation in the example, because both results are against my expectation. I expected "a 1 BET". I would also appreciate comments on your expectations.)

We may have made a mistake, I know, but we have checked this many times. Here are the intermediate results I obtained from running the algorithm while retaining format codes:

Original character types: "RLE R PDF EN LRO L PDF"
P1-P3: paragraph embedding level becomes 1.
X1-X8: levels become "? 3 ? 1 ? 2 ?".
modified X9: types become "BN R BN EN BN L BN", levels become "1 3 3 1 1 2 2".
X10: four runs, (sor, eor) are (R, R), (R, R), (R, L), (L, L).
W1-W5: no change.
modified W6: types become "ON R ON EN ON L ON".
W7: no change.
N1: types become "R R R EN ON L L"
N2: types become "R R R EN R L L"
I1-I2: levels become "1 3 3 2 1 2 2".
modified L1: levels become "1 3 3 2 1 2 1".
L2: the ordering becomes " a BET 1 ".
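For anyone who wants to check the L2 step by machine rather than by pencil, here is a minimal Python sketch. It assumes the usual reading of L2 (from the highest level down to the lowest odd level, reverse each contiguous run at that level or higher) and takes the two level sequences above as given; it does not run the rest of the algorithm. The names `reorder_l2` and `visible` are made up for this illustration.

```python
FORMAT_CODES = {"RLE", "LRO", "PDF"}  # the invisible controls in the example


def reorder_l2(items, levels):
    """Rule L2 (as read here): from the highest level down to the lowest
    odd level, reverse each contiguous run at that level or higher."""
    seq = list(zip(items, levels))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return list(items)  # nothing right-to-left, nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(seq):
            if seq[i][1] >= level:
                j = i
                while j < len(seq) and seq[j][1] >= level:
                    j += 1
                seq[i:j] = reversed(seq[i:j])  # reverse this run in place
                i = j
            else:
                i += 1
    return [item for item, _ in seq]


def visible(items):
    """Drop the format codes, which have no visible rendering."""
    return [item for item in items if item not in FORMAT_CODES]


# Levels obtained while retaining format codes (after the modified L1).
retained = visible(reorder_l2(
    ["RLE", "BET", "PDF", "1", "LRO", "a", "PDF"],
    [1, 3, 3, 2, 1, 2, 1],
))

# Levels obtained when X9 removes the format codes.
removed = visible(reorder_l2(["BET", "1", "a"], [3, 2, 2]))

print(" ".join(retained))  # a BET 1
print(" ".join(removed))   # BET 1 a
```

The two printed orders differ, which is the discrepancy described above. The sketch only confirms the reordering arithmetic, not which of the two level assignments is the intended one.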
--roozbeh

----- Original Message -----
From: "Roozbeh Pournader"
To: "Unicode List"
Sent: Tuesday, December 19, 2000 02:32
Subject: Another Bug in Bidi

>
> Dear Mark,
>
> It seems that rules P1--P3 of the Unicode bidi algorithm (for determining
> paragraph levels) don't make sense in some ways. I think that explicit
> directional codes should also be counted in this. So RLO and RLE would
> also be able to change the paragraph level to one, which is what a user
> expects when she is using an otherwise left to right paragraph in
> something like RLO. I know about the solution with RLM, but I think using
> both RLO and RLM when an RLO will do is somehow bad. I feel this many
> control codes spare.
>
> Would you please comment on this?
>
> --roozbeh
>
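The behaviour described in the quoted message can be seen with a small Python sketch. It assumes a simplified reading of P2-P3: the paragraph level comes from the first character of type L, R, or AL, and everything else (including RLO and RLE) is skipped. Refinements in later versions of the algorithm are ignored. `paragraph_level` is a hypothetical helper name; `unicodedata.bidirectional` is the standard-library lookup for a character's bidi class.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE, bidi class "RLO" (not a strong type)
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class "R"


def paragraph_level(text):
    """Simplified P2-P3: use the first strong character (L, R, or AL).

    Assumption: explicit codes such as RLO/RLE are skipped, as in the
    rules discussed in the message above.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0  # no strong character found: default to left-to-right


print(paragraph_level(RLO + "abc"))        # 0: RLO alone does not count
print(paragraph_level(RLO + RLM + "abc"))  # 1: the extra RLM is what sets it
```

Under this reading an RLO at the start of an otherwise Latin paragraph leaves the paragraph level at 0, and only the additional RLM raises it to 1, which is the doubling of control codes the message objects to.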