RE: Directionality Standard

From: Kent Karlsson
Date: Wed Dec 19 2007 - 17:02:14 CST


    Stephane Bortzmeyer wrote:
    > > Can't a Hebrew site have a news in Hebrew, with a long quotation of
    > > the speech of an American politician in English in an ltr paragraph?
    > Yes, and Unicode handles it fine, in plain text, without the need for
    > support from a markup language (because each Unicode character has a
    > direction).

    No, that's not the issue. The display of a line of bidi text (with
    an actual mix of directions) differs completely depending on the
    top-level paragraph direction. That direction is NOT derived from
    "each Unicode character has a direction" (considering just the
    characters that have strong directionality).

    The initial poster in this thread gave a good example. But here is a
    simpler one, using the convention that uppercase denotes RTL letters.
    The *same* input text, logical order "ABCdefGHI", gets the display

    CBAdefIHG if the top level direction is LTR (a.k.a. level 0)
    IHGdefCBA if the top level direction is RTL (a.k.a. level 1)
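
The two display orders above can be reproduced with a small sketch.
It is not a full implementation of the bidi algorithm: it assumes the
same toy convention as the example (uppercase stands for RTL letters,
everything else is treated as an LTR letter), handles no neutrals,
digits, or explicit controls, and the function name display_order is
made up for illustration.

```python
def display_order(logical: str, base_level: int) -> str:
    """Reorder a toy bidi string for display.

    Assumption: uppercase = strong RTL, anything else = strong LTR.
    base_level is 0 for an LTR paragraph and 1 for an RTL paragraph.
    """
    # Assign an embedding level to each character. RTL letters end up at
    # level 1 in either paragraph direction; LTR letters sit at the base
    # level in an LTR paragraph and at level 2 inside an RTL paragraph.
    levels = []
    for ch in logical:
        if ch.isupper():
            levels.append(1)
        else:
            levels.append(0 if base_level == 0 else 2)

    chars = list(logical)
    highest = max(levels, default=base_level)
    lowest_odd = min((lv for lv in levels if lv % 2 == 1), default=highest + 1)

    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters whose level is at least that level.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


print(display_order("ABCdefGHI", 0))  # LTR paragraph
print(display_order("ABCdefGHI", 1))  # RTL paragraph
```

The only input that differs between the two calls is base_level, which
is exactly the piece of information the characters themselves do not
carry.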

    The top-level paragraph direction is not inherent in the text (and
    *cannot* be). The bidi algorithm does specify a default (taken from
    the first strong character of the paragraph), but it is just a
    default, usually overridden by markup (or a language tag) when one
    is available. The default is not stable under editing, unless the
    editor forces an LRM or RLM character at the beginning of each
    paragraph.
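
To illustrate why the default is fragile, here is a rough sketch of the
"first strong character" rule using Python's standard unicodedata
module. It is a simplification: it ignores isolates and other explicit
formatting characters, and the helper name default_paragraph_level is
hypothetical.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def default_paragraph_level(text: str) -> int:
    """Return 0 (LTR) or 1 (RTL) based on the first strong character.

    Simplified: skips everything that is not bidi class L, R, or AL,
    and falls back to LTR when no strong character is found.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0


shalom = "\u05e9\u05dc\u05d5\u05dd"  # a Hebrew word

print(default_paragraph_level(shalom + " hello"))   # starts with Hebrew
print(default_paragraph_level(" hello"))            # Hebrew word deleted
print(default_paragraph_level(RLM + " hello"))      # RLM pins the direction
```

Deleting the first word flips the guessed direction of the whole
paragraph, while a leading RLM (or LRM) keeps it fixed regardless of
later edits.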

            /kent k
