Re: Directionality Standard

From: Waleed Oransa
Date: Sun Dec 30 2007 - 11:44:22 CST


    I agree with Behnam,

    It's very important that the Unicode Standard encode the original
    direction of Bidi text, mainly so that the text is displayed the same
    way the user entered it. Many Arabic and Hebrew users face this problem
    on the web and in many applications. Of course, this issue can be
    handled at the application level, for example by storing the
    directionality as a separate value in the DB, but that doesn't guarantee
    interoperability of Unicode Bidi text between different applications,
    since the directionality is not embedded in the text. Encoding the
    directionality within the text would guarantee that any Unicode
    application is able to display the Bidi text in the correct direction.

    The lack of a standard way to encode the directionality of the text
    leads to several incompatible implementations, as in the following
    examples:
    On Windows: The directionality is determined by the application, and
    the original orientation of the text is not stored or encoded within it.
    On Linux (GNOME/GTK+): The directionality is contextual: it depends on
    the first strong character in the text and is determined by the GTK
    widget. The original orientation of the text is not stored or encoded
    within it.
    On Windows (MS Word, OpenOffice Writer): For the Doc/ODF file formats,
    the original orientation of the text is stored and encoded inside the
    file. For plain text files, the directionality is not stored within the
    text; the user can select the directionality in MS Word at file loading
    time.
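
The first-strong-character heuristic described for GTK above can be sketched in a few lines of Python. This is only an illustration, not GTK's actual code: the function name `first_strong_direction` and its `default` parameter are invented for this example, and the sketch ignores the isolate and embedding handling that the full Unicode Bidirectional Algorithm applies. It also shows why a leading U+200F RIGHT-TO-LEFT MARK, which carries a strong RTL class, changes the detected direction of otherwise identical text.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R


def first_strong_direction(text, default="ltr"):
    """Guess the paragraph direction from the first strong character.

    Hypothetical helper: strong classes are L (left-to-right) and
    R / AL (right-to-left). Weak and neutral characters are skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found


# "Hello" followed by the Hebrew word shalom.
mixed = "Hello \u05e9\u05dc\u05d5\u05dd"

print(first_strong_direction(mixed))        # ltr: 'H' is the first strong character
print(first_strong_direction(RLM + mixed))  # rtl: the invisible mark comes first
```

Because the direction is recomputed from the text each time, editing the first word of a paragraph can flip the whole paragraph unless a mark pins it.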

    Let's discuss this issue and the various ways to reflect a solution in
    the Unicode Standard.

    Behnam
    22/12/2007 01:32

    Jony Rosenne
    Re: Directionality Standard

    I do not expect all applications to support RTL, because unfortunately
    I am realistic. What I do expect, though, is that my RTL paragraph,
    going to an application that doesn't support RTL, and then from that
    application to yet another application which does support RTL,
    reappears in RTL, the same way that the encoded text reappears intact.
    This is what Unicode should ensure happens, as it does for the
    characters themselves.

    I also believe that different formats and mediums should solve their
    technical problems without interfering with directionality encoded in
    a text.

    I'm amazed that Unicode has put so much effort into implementing the
    bidi algorithm for so many characters, and so little into keeping it
    useful.


    On 20-Dec-07, at 11:48 AM, Jony Rosenne wrote:

    > The difficulty with invisible marks is that they are not visible and
    > thus are easily overlooked.
    > Jony
    > -----Original Message-----
    > From: [mailto:unicode-] On Behalf Of Asmus Freytag
    > Sent: Thursday, December 20, 2007 10:31 AM
    > To: Kent Karlsson
    > Cc: 'Stephane Bortzmeyer'; 'Behnam';
    > Subject: Re: Directionality Standard
    > Kent,
    > Adding an LRM or RLM at the head of the paragraph allows the Unicode
    > text itself to carry an indication of the desired top-level
    > directionality. That indication will be picked up by any
    > implementation of the *default* algorithm (but is easily overridden by
    > any external markup in protocols that support it). The way it works is
    > that the mark counts as a letter with strong directionality, in this
    > case the first strong letter used for setting the top-level
    > directionality, while being otherwise invisible in the display.
    > A./
    > On 12/19/2007 3:02 PM, Kent Karlsson wrote:
    >> Stephane Bortzmeyer wrote:
    >>>> Can't a Hebrew site have a news item in Hebrew, with a long
    >>>> quotation of the speech of an American politician in English in an
    >>>> LTR paragraph?
    >>> Yes, and Unicode handles it fine, in plain text, without the need
    >>> for support from a markup language (because each Unicode character
    >>> has a direction).
    >> No, that's not the issue. The display of a line of bidi text (with
    >> actual mix of directions) becomes completely different depending on
    >> the top level paragraph direction. That is NOT derived from "each
    >> Unicode character has a direction" (considering just those that
    >> have strong directionality).
    >> The initial poster in this thread gave a good example. But here is a
    >> simpler one, using the convention that uppercase denotes RTL letters.
    >> The *same* input text, logical order "ABCdefGHI", gets the display
    >>   CBAdefIHG if the top level direction is LTR (a.k.a. level 0)
    >>   IHGdefCBA if the top level direction is RTL (a.k.a. level 1)
    >> The top level paragraph direction is not inherent in the text (and
    >> *cannot* be), though the bidi algorithm specifies a default, but just
    >> a default, usually overridden by markup (or a language tag) when
    >> markup (or a language tag) is available, since the default is not
    >> stable under editing (unless the editor forces the use of an LRM or
    >> RLM character at the beginning of each paragraph).
    >> /kent k
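
Kent's "ABCdefGHI" example can be reproduced with a deliberately tiny reordering sketch in Python. It borrows his convention that uppercase letters stand in for RTL letters and lowercase for LTR letters, and it assumes the text contains only such letters: no spaces, digits, neutrals, or explicit formatting characters. The function names `resolve_levels` and `reorder` are invented for this illustration; this is not a conforming implementation of the Unicode Bidirectional Algorithm, only the level assignment for strong letters plus the run-reversal step.

```python
def resolve_levels(text, base_level):
    """Assign an embedding level to each letter.

    Toy convention: uppercase = RTL letter (needs an odd level),
    anything else = LTR letter (needs an even level).
    """
    levels = []
    for ch in text:
        wants_odd = ch.isupper()
        if (base_level % 2 == 1) == wants_odd:
            levels.append(base_level)      # same direction as the paragraph
        else:
            levels.append(base_level + 1)  # opposite direction: one level deeper
    return levels


def reorder(text, base_level):
    """Return the display order for base_level 0 (LTR) or 1 (RTL)."""
    chars = list(text)
    levels = resolve_levels(text, base_level)
    if not chars:
        return ""
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2 == 1), default=highest + 1)
    # From the highest level down to the lowest odd level, reverse every
    # contiguous run of characters at that level or higher.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            levels[i:j] = levels[i:j][::-1]  # keep levels aligned with chars
            i = j
    return "".join(chars)


print(reorder("ABCdefGHI", 0))  # CBAdefIHG (top level LTR)
print(reorder("ABCdefGHI", 1))  # IHGdefCBA (top level RTL)
```

The two calls differ only in `base_level`, which is exactly the piece of information the thread argues is not recoverable from the characters alone.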

