Re: Multiple Directions (was: Re: Coptic/Greek (Re: Phoenician))

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri May 14 2004 - 07:25:14 CDT


    From: "Andrew C. West" <andrewcwest@alumni.princeton.edu>
    (...)
    > As has been stated time and time again, mixing vertical and horizontal textual
    > orientation in the same document is beyond the scope of a plain text standard,
    > and rendering mixed horizontal/vertical text is certainly beyond the ability of
    > any plain text editor that I know of. Markup is the appropriate way to deal with
    > mixed horizontal/vertical/diagonal/circular/spiral text (Artemis Fowl has a
    > constructed "script" with spiral textual orientation), not dozens of new
    > directional control characters.

    What about boustrophedon? Isn't there also some vertical boustrophedon layout,
    i.e. TTB and BTT alternated on each vertical row?

    My opinion is that, whatever the directionality used, it does not matter. Bidi
    character properties are only useful when handling local changes of
    directionality within the same document, so that they require reordering when
    rendering mixed scripts, before the final main directionality is applied (this
    final main directionality could use horizontal/vertical rows or whatever, and
    the direction of rows can be constant or alternated; it does not matter and this
    is out of scope of Unicode encoding).

    LRO/LRM controls and so on are to be used to override the main direction of
    characters belonging to the same script, when they are used in contexts where
    the main direction must be escaped. BiDi character properties are there to avoid
    using these controls when they are not necessary. If something is not specified
    in BiDi properties, then the characters will be laid out according to the (out
    of Unicode scope) document directionality.
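
    To make the role of the BiDi properties concrete, here is a minimal Python
    sketch that just reads the Bidi_Class of a few characters with the standard
    unicodedata module. The sample set and the helper name bidi_classes are my own
    choice for illustration and are not taken from any specification.

```python
import unicodedata

# Illustrative sample only: a few strong, weak, neutral and control characters.
SAMPLES = {
    "LATIN CAPITAL A": "A",
    "HEBREW ALEF": "\u05d0",
    "ARABIC ALEF": "\u0627",
    "DIGIT ONE": "1",
    "SPACE": " ",
    "LRM": "\u200e",  # LEFT-TO-RIGHT MARK
    "LRO": "\u202d",  # LEFT-TO-RIGHT OVERRIDE
}

def bidi_classes(chars):
    """Map each label to the Bidi_Class of its character (hypothetical helper)."""
    return {label: unicodedata.bidirectional(ch) for label, ch in chars.items()}

classes = bidi_classes(SAMPLES)
for label, cls in classes.items():
    print(f"{label:16} {cls}")
```

    Letters carry a strong class (L, R or AL), digits and spaces carry weak or
    neutral classes that are resolved from context, and only the explicit
    controls need to be inserted by hand.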

    Maybe this should be clarified in the Unicode spec, so that these controls and
    properties are defined in terms of "character direction" (the second "row
    direction" will not be encoded, allowing boustrophedon or unidirectional
    layouts), instead of just "left" and "right".
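
    The separation between "character direction" and "row direction" can be
    sketched in a few lines of Python. This is only a toy model under strong
    assumptions: the input is taken to be already reordered into a single
    character direction, rows have a fixed width in characters, and the function
    name layout_rows is hypothetical. Glyph mirroring and alignment of a short
    last row are ignored.

```python
def layout_rows(visual_text, width, boustrophedon=False):
    """Cut text into rows; optionally reverse every second row.

    The row direction is a pure layout decision: the encoded text is the
    same string in both cases.
    """
    rows = [visual_text[i:i + width] for i in range(0, len(visual_text), width)]
    if boustrophedon:
        # Rows are numbered from 0; odd-numbered rows run the other way.
        rows = [row[::-1] if n % 2 else row for n, row in enumerate(rows)]
    return rows

unidirectional = layout_rows("ABCDEFGH", 3)
alternating = layout_rows("ABCDEFGH", 3, boustrophedon=True)
print(unidirectional)  # ['ABC', 'DEF', 'GH']
print(alternating)     # ['ABC', 'FED', 'GH']
```

    Nothing in the encoded string changes between the two calls, which is the
    sense in which the row direction does not need to be encoded.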

    The well-known exception to this directionality model is Hangul, whose clusters
    adopt a local horizontal/vertical layout for rendering their composite jamos in
    the same syllable. If leading and trailing consonants had not been encoded
    separately, one would need to encode a special punctuation to mark syllable
    boundaries. (This punctuation would not necessarily have a visible glyph; it
    could be a thin space or an arrangement of the text layout, in the traditional
    Hangul squares).
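
    For reference, here is a small Python sketch of how the separately encoded
    leading and trailing consonants combine into a precomposed syllable. The
    constants are the ones used by the Unicode Hangul syllable composition
    arithmetic; the function name compose_syllable is hypothetical, and the
    sketch assumes valid modern conjoining jamos as input (no range checks).

```python
import unicodedata

S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28  # T_COUNT includes the "no trailing consonant" case

def compose_syllable(lead, vowel, trail=None):
    """Compose conjoining jamos into one precomposed syllable (no validation)."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

# HIEUH (leading) + A + NIEUN (trailing)
han = compose_syllable("\u1112", "\u1161", "\u11ab")
print(f"U+{ord(han):04X}")  # U+D55C

# The same consonant has two code points, depending on its position.
print(unicodedata.name("\u1102"))  # HANGUL CHOSEONG NIEUN
print(unicodedata.name("\u11ab"))  # HANGUL JONGSEONG NIEUN
```

    Because the position is part of the code point, the syllable boundary can be
    recovered from the jamo sequence alone, without any separator character.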

    In Hangul, syllable breaks are marked by the layout, but not word breaks; in
    Latin/Greek/Cyrillic/Hebrew it is the reverse, and I consider SPACE a
    punctuation mark; either word breaks or syllable breaks are needed to make the text
    readable, i.e. less ambiguous, to reflect the speech and common semantics of
    words where these breaks are often heard and needed too.

    If this Hangul layout were better understood, and implemented as a layout
    feature, one could easily see that Hangul is extremely simple and regular, and
    has very few letters (for example, SSANGSIOS is currently encoded distinctly
    from the sequence SIOS, SIOS, although the two are semantically identical and
    should be rendered identically, unless one is a trailing consonant and the other
    a leading one, in which case their separation is marked either in the layout by a
    cluster boundary, or by an explicit punctuation which could as well be a thin
    space character or a small dot mark).
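
    The distinct encoding can be checked with a short Python sketch using the
    standard unicodedata module. It only shows what normalization does with the
    two spellings; the variable names are my own.

```python
import unicodedata

SIOS = "\u1109"       # HANGUL CHOSEONG SIOS
SSANGSIOS = "\u110a"  # HANGUL CHOSEONG SSANGSIOS
A = "\u1161"          # HANGUL JUNGSEONG A

def nfc(text):
    return unicodedata.normalize("NFC", text)

double_letter = nfc(SSANGSIOS + A)  # one leading consonant + vowel
two_letters = nfc(SIOS + SIOS + A)  # two leading consonants + vowel

print([f"U+{ord(c):04X}" for c in double_letter])  # ['U+C2F8']
print([f"U+{ord(c):04X}" for c in two_letters])    # ['U+1109', 'U+C0AC']
print(nfc(SIOS + SIOS) == SSANGSIOS)                # False
```

    Normalization keeps the two spellings apart: the double letter composes into
    a single syllable, while the sequence leaves a stray SIOS in front of a
    different syllable.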

    The current encoding of Hangul ignores this feature, and makes handling Hangul
    unnecessarily complicated, when it could all be handled as a strict encoding of a
    "horizontal" row of text, with a special layout to compose squares.
    Square layout does not seem mandatory in Hangul, and Koreans can also read text
    rendered with uniform halfwidth and unidirectional jamos, making it a true
    alphabet. Vertical presentation is also common for this, and readers who
    can already read text horizontally or vertically would read without much
    trouble a boustrophedon layout, or featured layouts like spiral or circular,
    provided that glyph orientation is kept recognizable.
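
    As a rough illustration of Hangul as a plain row of letters, canonical
    decomposition (NFD) already turns precomposed syllables into a linear
    sequence of jamos. The sketch below assumes Python's standard unicodedata
    module; the helper name to_jamo_row is hypothetical. Note that a renderer
    may still stack these conjoining jamos into squares, so this shows the
    encoding side only, not the linear presentation itself.

```python
import unicodedata

def to_jamo_row(text):
    """Return the text as a flat sequence of conjoining jamos (via NFD)."""
    return unicodedata.normalize("NFD", text)

word = "\ud55c\uae00"  # the two syllables of the word "Hangul"
row = to_jamo_row(word)
print(len(word), len(row))  # 2 6
print([f"U+{ord(ch):04X}" for ch in row])
```

    Recomposing with NFC gives back the original two syllables, so the square
    form and the linear form carry the same information.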

    I see the square layout only as the preferred layout for Koreans, as it fits well
    with Han characters and with its long, strong tradition of presentation. Han
    ideographs also have a square layout of strokes. But they are a bit more complex
    because they use many featured ligatures, so that strokes take some contextual
    shapes depending on surrounding strokes and the number of strokes in the square
    (these make the ideographs more readable, by a more uniform distribution of
    blackness and stroke widths within the square, or by enhancing the symmetries and
    parallelisms). By contrast, the limited set of letters (strokes) in Hangul
    and the absence of overlays make the rendering task much easier within squares.



    This archive was generated by hypermail 2.1.5 : Fri May 14 2004 - 07:25:44 CDT