[Q] Bidirectional Algorithm

From: Jonathan Rosenne (100320.1303@CompuServe.COM)
Date: Wed Nov 13 1996 - 19:42:57 EST

Kent Johnson wrote:
>1. In step T6, why are implicit directional formatting codes (RLM and LRM,
>according to the description on p 3-15) removed?

I guess it means RLE and LRE, not LRM and RLM. See the preceding line.

>2. Do sot/eot refer to start and end of an embedding level, or only the start
>and end of the entire block being processed? If only to the start and end of
>block, what is the behavior of the algorithm in steps P0 and N1-N3 when a level
>change is detected?

sot and eot refer to the entire text being processed, i.e. up to the block
separator. The P0 question pertains to Arabic, so I cannot answer it. N2 clearly
defines the effect of sot and eot on neutrals.
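
To make the scope of sot and eot concrete, here is a minimal Python sketch that
cuts a text into the units the algorithm would process. It assumes that a block
separator is any character whose bidirectional class Python's unicodedata
module reports as "B" (that module uses current Unicode data, not the 1996
tables). split_blocks is a hypothetical helper name, not something from the
standard.

```python
import unicodedata

def split_blocks(text):
    """Split text at block separators; each result has its own sot and eot."""
    blocks, current = [], []
    for ch in text:
        # Assumption: bidirectional class "B" marks a block separator.
        if unicodedata.bidirectional(ch) == "B":
            blocks.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        blocks.append("".join(current))
    return blocks

# U+2029 PARAGRAPH SEPARATOR ends the first block.
print(split_blocks("abc\u2029def"))
```

Under this reading, sot and eot are simply the two ends of each returned
string, regardless of any embedding levels inside it.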

>4. In step I1, should this read, "Numeric text (EN) goes up two levels unless
>preceded by left-to-right text AT THE SAME EMBEDDING LEVEL"? If so, then what
>happens to an EN at the beginning of an embedding level? What if the first
>character after sot is EN?

In both cases the EN is not preceded by left-to-right text, so I1 does apply.
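
A small Python sketch of that reading of I1, using only the wording quoted in
the question. It assumes that "preceded by left-to-right text" means the
nearest preceding strong character in the sequence is L, and that the sequence
passed in starts at sot or at the start of the embedding level. en_level and
the type names "L", "R", "N", "EN" are illustrative, not taken from the
standard.

```python
def en_level(types, i, embedding_level):
    """Level of the EN at index i, per the quoted wording of I1."""
    # Walk backwards to the nearest strong character, skipping everything else.
    for t in reversed(types[:i]):
        if t == "L":
            return embedding_level       # preceded by left-to-right text
        if t == "R":
            break                        # preceded by right-to-left text
    # Nothing strong before the EN (sot or start of level), or an R:
    # I1 applies and the number goes up two levels.
    return embedding_level + 2

print(en_level(["EN"], 0, 0))            # EN is the first character after sot
print(en_level(["L", "N", "EN"], 2, 0))  # EN after left-to-right text
```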

>5. Do quote marks and parentheses affect the embedding level in a special way?
>There is no mention of these characters in the algorithm, and their character
>type is "Other neutral" in UNIDATA2, but ALL of the examples containing quote
>marks on pp 3-20 and 3-21 seem to require either special handling of single and
>double quotes, or use of explicit embedding codes.
>For example, on page 3-21:
>Memory: he said "car MEANS CAR."
>Resolved levels: 000000000222111111111100

>Why is car at level 2? If quote has no special meaning, car should be at level
>0. On the other hand, if quote is interpreted as pushing a level, then why is
>the period at level 0 instead of level 1? The only way I can duplicate this
>result is to insert a LRE between quote and car, and a PDF between CAR and

The example shows the memory and resolved levels at this stage. The LRE etc.
were removed in T6. Obviously there had been either an LRE or an LRO.
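
The effect can be sketched in Python. This is a simplified model, not the
algorithm from the standard. It assumes that uppercase stands for right-to-left
letters and lowercase for left-to-right letters, that everything else is
neutral, and that neutrals take the direction of matching neighbours at the
same level or else the embedding direction. Which code was present and where is
also an assumption. The sketch places an RLE after the opening quote and a PDF
after CAR, because that placement reproduces the listed levels under these
rules. All function names are hypothetical.

```python
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"

def char_type(ch):
    # Assumed example convention: UPPERCASE = R, lowercase = L, rest neutral.
    return "R" if ch.isupper() else "L" if ch.islower() else "N"

def embedding_levels(text, base=0):
    """Assign embedding levels and drop the explicit codes (as T6 would)."""
    stack, level, out = [], base, []
    for ch in text:
        if ch == RLE:
            stack.append(level)
            level = (level + 1) | 1      # next odd level
        elif ch == LRE:
            stack.append(level)
            level = (level + 2) & ~1     # next even level
        elif ch == PDF:
            if stack:
                level = stack.pop()
        else:
            out.append((ch, level))
    return out

def nearest_strong(types, levels, i, step):
    # Look for a strong type in one direction without leaving the level run.
    j = i + step
    while 0 <= j < len(types) and levels[j] == levels[i]:
        if types[j] != "N":
            return types[j]
        j += step
    return None

def resolve(pairs):
    types = [char_type(ch) for ch, _ in pairs]
    levels = [lvl for _, lvl in pairs]
    resolved = []
    for i, t in enumerate(types):
        emb = "R" if levels[i] % 2 else "L"
        if t == "N":
            before = nearest_strong(types, levels, i, -1) or emb
            after = nearest_strong(types, levels, i, +1) or emb
            t = before if before == after else emb
        # Text running against the embedding direction goes up one level.
        resolved.append(levels[i] + (1 if t != emb else 0))
    return resolved

source = 'he said "' + RLE + "car MEANS CAR" + PDF + '."'
pairs = embedding_levels(source)
memory = "".join(ch for ch, _ in pairs)
levels_text = "".join(str(lvl) for lvl in resolve(pairs))
print(memory)       # he said "car MEANS CAR."
print(levels_text)  # 000000000222111111111100
```

The printed memory no longer contains the explicit codes, yet the levels still
reflect them, which is the situation the example on page 3-21 shows.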

>7. An observation: In step N3, the sequences R N EN N R and R N EN N L should
>never occur because in step P0 they resolve to R N AN N R and R N AN N L

P0 applies only to Arabic.

