Resolving Weak and Neutral Types [BIDI]

From: fantasai
Date: Mon Jul 19 2004 - 15:56:51 CDT

    I've been going through the Unicode BIDI Algorithm, and I'm having trouble
    understanding the justification for the way rules W7 and N1 are formulated. The spec says:

    # W7. Search backwards from each instance of a European number until the first
    # strong type (R, L, or sor) is found. If an L is found, then change the type
    # of the European number to L.

    # N1. A sequence of neutrals takes the direction of the surrounding strong text
    # if the text on both sides has the same direction. European and Arabic numbers
    # act as if they were R in terms of their influence on neutrals.
    # Start-of-level-run (sor) and end-of-level-run (eor) are used at level run
    # boundaries.

    I understand that these rules are intended to handle things like "BMW 500" in
    the middle of an Arabic text. But it also goes back through list separators as
    in the sequence

      start> SEE SECTIONS 22a, 53, 62, 95c. >end [uppercase => rtl]

    In this case, the entire list (although not the letters/numbers within each
    item) would be ordered left-to-right instead of right-to-left like the rest of
    the sentence. Is there a reason why W7 searches back through double-CS and N?
    (Other than preparing for N1, which could have been written not to assume W7.)
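To make the effect concrete, here is a minimal Python sketch of W7 and N1 as quoted above, run on the SECTIONS example. It is an illustration under stated assumptions, not the reference implementation: `classify`, `resolve_w7`, `resolve_n1` and `resolve` are hypothetical names, the classifier follows the uppercase => rtl convention of the example, W1-W5 and explicit embeddings are skipped, every separator is simply turned into a neutral (as W6 does for separators left over by W4), and sor/eor are assumed to be R.

```python
STRONG = {"L", "R"}
NEUTRAL = {"B", "S", "WS", "ON"}


def classify(ch):
    """Toy classifier: uppercase stands in for RTL letters (example convention)."""
    if ch.isdigit():
        return "EN"
    if ch.isupper():
        return "R"
    if ch.islower():
        return "L"
    if ch in ",.":
        return "CS"
    if ch.isspace():
        return "WS"
    return "ON"


def resolve_w7(types, sor):
    """W7: an EN becomes L if the first strong type found going backwards is L."""
    out = list(types)
    for i, t in enumerate(types):
        if t != "EN":
            continue
        strong = sor  # used when no strong type precedes the number
        for j in range(i - 1, -1, -1):
            if types[j] in STRONG:
                strong = types[j]
                break
        if strong == "L":
            out[i] = "L"
    return out


def resolve_n1(types, sor, eor):
    """N1: a run of neutrals takes the direction shared by both neighbours."""

    def direction(t):
        # EN and AN influence neutrals as if they were R.
        return "R" if t in ("R", "EN", "AN") else "L"

    out = list(types)
    i, n = 0, len(types)
    while i < n:
        if types[i] not in NEUTRAL:
            i += 1
            continue
        j = i
        while j < n and types[j] in NEUTRAL:
            j += 1
        before = direction(types[i - 1]) if i > 0 else sor
        after = direction(types[j]) if j < n else eor
        if before == after:
            out[i:j] = [before] * (j - i)
        i = j  # runs with differing neighbours are left for N2
    return out


def resolve(text, sor="R", eor="R"):
    types = [classify(ch) for ch in text]
    # Simplified W6: treat every remaining separator as a neutral.
    types = ["ON" if t in ("CS", "ES", "ET") else t for t in types]
    types = resolve_w7(types, sor)
    return resolve_n1(types, sor, eor)


if __name__ == "__main__":
    text = "SEE SECTIONS 22a, 53, 62, 95c."
    for ch, t in zip(text, resolve(text)):
        print(repr(ch), t)
```

In this sketch, "22" stays EN because the first strong type behind it is R, but every later number finds the "a" of "22a" first and becomes L. N1 then turns each ", " between items into L as well, so everything from "a" through "c" resolves to L. The final period sits between L and eor and is left neutral for N2.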

    I noticed, btw, that none of the examples for N1 have a neutral between two


