Re: Some control characters test cases

From: Simon Montagu (smontagu@smontagu.org)
Date: Sat Sep 08 2007 - 14:50:38 CDT


    Itai Bar-Haim wrote:
    > Hi everyone.
    > I'm new to this mailing list, and to Unicode.
    > I develop a Bidi/Unicode library for the .NET environment called NBidi
    > (http://nbidi.sf.net).
    > While running test cases, I found problems regarding control characters.
    > I'll only ask about one scenario in this post.
    > The problematic test case is as follows:
    > Given text: <RLO>abc<PDF>
    > What should I expect as a result? My expectation would be (visual,
    > control characters removed) 'cba'.
    > If I leave the control characters in place I would expect: <PDF>cba<RLO>
    > The actual result I get is <PDF>abc<RLO>. This is because of:
    > Rule X4 sets the embedding levels to: 11111
    > Then rule I2 sets the embedding levels to: 12221
    > When performing reordering we get: <RLO>cba<PDF> ==> <PDF>abc<RLO>
    >
    > Am I missing something here?

    Itai,

    What you're missing is that directional overrides change the character
    types as well as the embedding level. Otherwise there would be no
    difference between embeddings and overrides. See the paragraph after
    rule X6:

    "If the directional override status is R, then characters become R. If
    the directional override status is L, then characters become L."

    So in your case the "abc" become R. Rule I2 raises only characters of
    type L, EN, or AN at odd embedding levels, so it does not affect them
    and they remain at embedding level 1.
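
    To make the difference concrete, here is a minimal Python sketch of the
    relevant steps (X4, X6, X7, I1/I2 and the L2 reversal). It is not NBidi
    code and not a full bidi implementation: the function names and the
    honor_override flag are made up for illustration, the paragraph level is
    assumed to be 0, every non-control character is assumed to be a Latin
    letter (type L), only RLO and PDF are handled, and the control characters
    are dropped from the output.

```python
RLO, PDF = "\u202e", "\u202c"  # RIGHT-TO-LEFT OVERRIDE, POP DIRECTIONAL FORMATTING


def resolve_levels(text, honor_override=True):
    """Return (chars, levels) with the control characters removed.

    Assumptions: paragraph level 0, all other characters are type L,
    and the maximum embedding depth is never reached.
    """
    chars, levels = [], []
    stack = []                      # saved (level, override) pairs
    level, override = 0, None
    for ch in text:
        if ch == RLO:               # X4: next odd level, override status R
            stack.append((level, override))
            level, override = (level + 1) | 1, "R"
        elif ch == PDF:             # X7: restore the state before the override
            if stack:
                level, override = stack.pop()
        else:
            bidi_type = "L"         # assumption: Latin letters only
            if override is not None and honor_override:
                bidi_type = override    # X6: the override resets the type
            if bidi_type == "L" and level % 2 == 1:
                resolved = level + 1    # I2: L at an odd level goes up one
            elif bidi_type == "R" and level % 2 == 0:
                resolved = level + 1    # I1: R at an even level goes up one
            else:
                resolved = level
            chars.append(ch)
            levels.append(resolved)
    return chars, levels


def reorder(chars, levels):
    """L2: reverse runs at each level, from the highest down to the lowest odd."""
    chars = list(chars)
    if not levels:
        return ""
    lowest_odd = min(levels) | 1
    for lvl in range(max(levels), lowest_odd - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= lvl:
                j = i
                while j < len(chars) and levels[j] >= lvl:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


text = RLO + "abc" + PDF
print(reorder(*resolve_levels(text)))                        # override applied
print(reorder(*resolve_levels(text, honor_override=False)))  # type left as L
```

    With the override applied, the letters resolve to levels 1, 1, 1 and the
    visual order is 'cba'. With honor_override=False, which mimics leaving the
    type as L, rule I2 raises them to 2, 2, 2 and the output is 'abc', the
    result reported in the question.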

    >
    > Thank you in advance,
    > Itai.


