RE: Directionality Standard

From: Jony Rosenne (jr@qsm.co.il)
Date: Thu Dec 20 2007 - 10:48:04 CST


    The difficulty with invisible marks is that they are not visible and thus
    are easily overlooked.

    Jony

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    Behalf Of Asmus Freytag
    Sent: Thursday, December 20, 2007 10:31 AM
    To: Kent Karlsson
    Cc: 'Stephane Bortzmeyer'; 'Behnam'; unicode@unicode.org
    Subject: Re: Directionality Standard

    Kent,

    Adding an LRM or RLM at the head of the paragraph allows the Unicode text
    itself to carry an indication of the desired top-level directionality.
    That indication will be picked up by any implementation of the *default*
    algorithm (but is easily overridden by external markup in protocols
    that support it). The way it works is that the mark counts as a letter
    with strong directionality, in this case the first strong letter used
    for setting the top-level directionality, while being otherwise
    invisible in the display.
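The first-strong-character rule described above can be sketched roughly as follows. This is a minimal illustration, not a full implementation of the bidi algorithm: the function name `default_paragraph_level` is hypothetical, and the sketch ignores explicit embedding and isolate controls as well as any higher-level protocol override.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L, invisible in display
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R, invisible in display


def default_paragraph_level(text: str) -> int:
    """Return 0 (LTR) or 1 (RTL) from the first strong character.

    Hypothetical helper: a simplified reading of the default rule,
    which looks only at bidi classes L, R and AL.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    # No strong character found: fall back to LTR.
    return 0


hebrew_first = "\u05d0\u05d1\u05d2 abc"   # starts with Hebrew letters
latin_first = "abc \u05d0\u05d1\u05d2"    # starts with Latin letters

print(default_paragraph_level(hebrew_first))        # Hebrew is first strong
print(default_paragraph_level(LRM + hebrew_first))  # LRM is first strong
print(default_paragraph_level(RLM + latin_first))   # RLM is first strong
```

Prefixing the mark changes the detected level without adding any visible character, which is the effect the paragraph describes.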

    A./

    On 12/19/2007 3:02 PM, Kent Karlsson wrote:
    >
    > Stephane Bortzmeyer wrote:
    >
    >>> Can't a Hebrew site have a news item in Hebrew, with a long quotation of
    >>> the speech of an American politician in English in an ltr paragraph?
    >>>
    >> Yes, and Unicode handles it fine, in plain text, without the need for
    >> support from a markup language (because each Unicode character has a
    >> direction).
    >>
    >
    > No, that's not the issue. The display of a line of bidi text (with
    > actual mix of directions) becomes completely different depending on
    > the top level paragraph direction. That is NOT derived from "each
    > Unicode character has a direction" (considering just those that
    > have strong directionality).
    >
    > The initial poster in this thread gave a good example. But here is a
    > simpler one, using the convention that uppercase denotes RTL letters.
    > The *same* input text, logical order "ABCdefGHI", gets the display
    >
    > CBAdefIHG if the top level direction is LTR (a.k.a. level 0)
    > IHGdefCBA if the top level direction is RTL (a.k.a. level 1)
    >
    > The top level paragraph direction is not inherent in the text (and
    > *cannot* be), though the bidi algorithm specifies a default, but just
    > a default, usually overridden by markup (or language tag) when markup
    > (or language tag) is available, since the default is not stable for
    > editing (unless the editor forces the use of an LRM or RLM char at the
    > beginning of each paragraph).
    >
    > /kent k
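Kent's "ABCdefGHI" example can be reproduced with a small sketch. It assumes his convention that uppercase stands for RTL letters and lowercase for LTR letters, handles only strong letters (no digits, spaces or punctuation), and applies the level-based reversal step of the bidi algorithm. The function name `display_order` is hypothetical.

```python
def display_order(logical: str, base_rtl: bool) -> str:
    """Reorder logical text for display under a given top-level direction.

    Toy model: uppercase = RTL letter, lowercase = LTR letter.
    """
    base = 1 if base_rtl else 0

    # RTL letters take the nearest odd level >= base,
    # LTR letters take the nearest even level >= base.
    levels = []
    for ch in logical:
        wants_odd = ch.isupper()
        is_odd = base % 2 == 1
        levels.append(base if wants_odd == is_odd else base + 1)

    chars = list(logical)
    max_level = max(levels, default=0)

    # From the highest level down to level 1, reverse every contiguous
    # run of characters at that level or higher.
    for level in range(max_level, 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


print(display_order("ABCdefGHI", base_rtl=False))  # CBAdefIHG
print(display_order("ABCdefGHI", base_rtl=True))   # IHGdefCBA
```

The same logical string yields two different display orders, and the only input that changes is the top-level direction.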


