RE: Directionality Standard

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Jan 09 2008 - 17:54:27 CST


    Waleed Oransa wrote:
    > It's very important that the Unicode standard encode
    > the original direction in the Bidi text. (...)
    > The missing of a standard way to encode the directionality
    > of the text (...)

    I tend to disagree with those statements. Unicode already offers the proper
    encoding for allowing all this, using Bidi embedding controls. They are
    enough for the intended purpose.

    If you mean a way to encode something that should apply to the WHOLE text,
    without limitation, then you'll limit the usability of the text, for example
    in quotations with mixed scripts.

    BiDi embedding controls solve the problem cleanly. But it's still up to the
    authors to use them when and where needed. The only alternative to this
    solution would be to use them before each character, and this would be a
    serious problem requiring all existing texts to be reencoded; the equivalent
    "solution" would be to reencode ALL the characters with mirroring or
    dependent directionality. But the caveat is that it would double the
    encoding of all the existing texts, creating new forms of "equivalences"
    that could have serious side effects if they were applied systematically.
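
    To illustrate how little is needed, here is a minimal sketch of a single
    embedding around a right-to-left quotation inside a left-to-right sentence.
    The helper name embed() and the sample strings are assumptions made for
    illustration only; the three code points are the embedding controls
    defined by Unicode (LRE, RLE, PDF).

```python
import unicodedata

# Bidi embedding controls defined by Unicode.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed(text: str, rtl: bool) -> str:
    """Wrap text in one embedding level (hypothetical helper, not a library API)."""
    opener = RLE if rtl else LRE
    return opener + text + PDF


# An Arabic quotation with trailing Latin text and punctuation,
# quoted inside an English sentence (sample data, chosen for illustration).
quote = "\u0645\u0631\u062D\u0628\u0627 (hello)!"
plain = "He wrote: " + quote + " in his reply."
marked = "He wrote: " + embed(quote, rtl=True) + " in his reply."

# The controls are ordinary code points stored in logical order;
# the Unicode Character Database reports their bidi classes directly.
bidi_classes = [unicodedata.bidirectional(c) for c in (LRE, RLE, PDF)]
print(bidi_classes)

# Cost of the embedding: two extra code points for the whole quotation,
# not one per character.
overhead = len(marked) - len(plain)
print(overhead)
```

    The rest of the sentence is untouched, which is the point: only the run
    that needs a different direction carries the controls.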

    I've not encountered any application where the simple addition of a single
    embedding control was not enough to specify the correct ordering and
    presentation of text, provided that the application had the minimum needed
    to support the existing BiDi algorithm (i.e. it needs to accept the presence
    of these controls, and not discard them or treat them as unknown characters
    displayed with a "character missing" glyph). The incompatible applications
    are, in any case, those designed only for basic Latin that were never
    internationalized properly, or that did not use any of the many common i18n
    libraries that have been available for a long time now and are integrated
    in almost all development tools or runtime platforms. In most cases, even
    the simplest applications can be recompiled without significant change, just
    by relinking them with updated libraries so that they get the support of
    BiDi embedding controls.
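
    As a sketch of what "accepting the presence of these controls" can mean in
    practice, compare a naive input filter that keeps only printable characters
    with one that also lets the bidi controls through. The function names and
    the chosen set of controls are assumptions for illustration; a real
    application would normally rely on its i18n library rather than hand-written
    filtering.

```python
import unicodedata

# Directional formatting characters the filter must preserve (assumed set:
# LRE, RLE, PDF, LRO, RLO, plus the LRM and RLM marks).
BIDI_CONTROLS = frozenset("\u202A\u202B\u202C\u202D\u202E\u200E\u200F")


def naive_clean(s: str) -> str:
    """Keep only printable characters; this silently drops bidi controls."""
    return "".join(c for c in s if c.isprintable())


def bidi_safe_clean(s: str) -> str:
    """Drop other non-printable characters but keep the bidi controls."""
    return "".join(c for c in s if c in BIDI_CONTROLS or c.isprintable())


# Sample input: an embedded RTL run followed by a stray BEL control character.
text = "abc \u202B\u05E9\u05DC\u05D5\u05DD\u202C def\x07"

# The embedding controls belong to general category Cf (format), which
# str.isprintable() treats as non-printable.
print(unicodedata.category("\u202B"))

print(naive_clean(text) == "abc \u05E9\u05DC\u05D5\u05DD def")
print(bidi_safe_clean(text) == "abc \u202B\u05E9\u05DC\u05D5\u05DD\u202C def")
```

    The first filter loses the directional information the author encoded; the
    second removes the stray control character and leaves the embedding intact.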

    The main issue that is more complicated to handle in applications is the
    layout of the GUI; however, this is not directly related to the encoding
    of text, but to user preferences. The text displayed in the GUI elements
    should work properly even if they are not in a GUI with RTL layout: they
    appear as paragraphs within the layout, and the paragraphs are shown
    correctly even if they are not right-aligned (right alignment of the margin
    is often possible in the application, including for LTR scripts, as a
    presentation style option, if there's no global setting that can define this
    style by default for the whole GUI layout). But even in this case, this is
    NOT a problem of text encoding, and it's completely out of scope of Unicode
    conformance rules.


