Re: Jumping Cursor. Was: Right-to-Left Punctuation Problem

From: Gregg Reynolds (unicode@arabink.com)
Date: Mon Aug 01 2005 - 23:38:54 CDT


    Mark E. Shoulson wrote:
    > Gregg Reynolds wrote:
    >>
    >> Unfortunately, that is not what this is about. I'll say it yet again:
    >> Arabic (like other RTL written languages) is *monodirectional*. Where
    >> this idea of "inherent" bidirectionality got started I'd like to know,
    >> so I could deliver a scrumptious knuckle sandwich. Anybody who still
    >> buys into this pernicious piece of mythology is welcome to email me,
    >> and I will try to put the worms out of your head.
    >>
    > Unicode has, for reasons of consistency and other kinds of sanity, seen
    > fit to declare that decimal-coded numbers shall be encoded
    > most-significant-digit first. Why that order? Who cares? It suffices

    Think economics. The choice of digit polarity has a huge impact. It is
    the sole reason the RTL world has so little good software. RTL display
    and contextual shaping are relatively trivial. Bidi algorithmics are
    not. Ask yourself, why does Emacs not yet support Arabic et al.?
    Without the bidi requirement imposed by the basic design decision to go
    absolute MSD, it would have been supported years ago, I daresay.
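To make the cost concrete, here is a toy sketch (not the Unicode Bidirectional Algorithm, just an illustration under simplifying assumptions) of what MSD-first digits force on an otherwise pure RTL line. Capital letters stand in for RTL letters; the function names `toy_visual_order` and `rtl_digit_visual_order` are hypothetical. With MSD-first digits the renderer has to find each digit run and flip it back after reversing the line; with hypothetical RTL digits, a single reversal would do.

```python
import re


def toy_visual_order(logical: str) -> str:
    """Toy reordering of a pure-RTL line stored in logical order.

    The result lists characters as they appear left to right on screen.
    Assumption: everything except digit runs is RTL. This is a sketch,
    not an implementation of the real bidi algorithm.
    """
    # Step 1: reverse the whole line, since RTL text starts at the right edge.
    reversed_line = logical[::-1]
    # Step 2: the digit runs got reversed as well, so flip each one back
    # to keep the most significant digit on the left.
    return re.sub(r"\d+", lambda m: m.group(0)[::-1], reversed_line)


def rtl_digit_visual_order(logical: str) -> str:
    """Hypothetical case: digits are RTL too (stored least significant
    digit first), so one reversal handles letters and digits alike."""
    return logical[::-1]


print(toy_visual_order("ABC 123 DEF"))        # prints "FED 123 CBA"
print(rtl_digit_visual_order("ABC 321 DEF"))  # prints "FED 123 CBA"
```

Both calls produce the same display. The difference is that the second needs no knowledge of runs or embedding levels, which is the part that gets expensive once cursor movement, selection, and line breaking have to agree with it.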

    You can repeat that 1000-fold. Got a fav piece of software and want to
    use it for Arabic? I'm willing to bet a fine bottle of Night Train that
    the answer will be "sorry, bidi is too expensive". Not "RTL layout" or
    "contextual shaping", mind you, but bidi support. If we had RTL digits
    etc. that barrier would go away instantly.

    If you have any economist friends ask them to ponder the total cost of
    the bidi requirement. I wouldn't be surprised if it ran into the $billions.

    > that they needed there to be *one* ordering. It's not about Arabic
    > being "inherently bidirectional", that's a consequence of the fact that
    > the imposed ordering goes against it.

    Ah! Agreed! Somebody understands! But see the text of the Unicode
    standard I quoted in my note to Ken.

    > Unfair to the Arabic-writers?
    > Probably, but them's the breaks when standards happen. RTL is in the
    > minority and gets the short end of the stick sometimes (and for other
    > reasons).

    True, but in this case there is a remedy, namely RTL digits and punctuation.

    > You might have *some* claim wrt Arabic numerals (by which I
    > mean those digits used when writing Arabic; you know what I mean), but
    > Israelis use just plain ol' ordinary numbers like the rest of the
    > Western world, and nobody will believe that it's a "different" 7 in
    > David than in Times.

    Commonly used in Arabic pubs too, although they tend to look pretty ugly
    - a sure sign that either the software was weak or the designers
    couldn't figure out how to get Arabic-Indic figures to display. (I've
    spent a good bit of time recently getting numbers to display in
    Arabic-Indic in Word docs for people who can't quite figure it out.)

    But which glyphs are used is not really relevant. The semantics would
    guarantee both correct layout and proper treatment as numbers.

    Ideally, all digit ranges used in RTL languages would have a set of
    strongly RTL codepoints, including the European forms, standard
    Arabic-Indic, and Eastern Arabic-Indic.
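For reference, a small sketch of how the existing digit ranges are classified today, using Python's standard `unicodedata` module. The `describe` helper and the `SAMPLES` table are illustrative names, not part of any library. None of the three digit sets carries a strong RTL class (they are EN or AN, not R or AL), while all three carry the same decimal value, which is the sense in which glyph choice and numeric semantics are separate.

```python
import unicodedata

# One digit seven from each range, plus two strongly RTL letters for contrast.
SAMPLES = {
    "European digit seven": "7",
    "Arabic-Indic digit seven": "\u0667",
    "Eastern Arabic-Indic digit seven": "\u06F7",
    "Arabic letter alef": "\u0627",
    "Hebrew letter alef": "\u05D0",
}


def describe(ch: str):
    """Return (bidi class, decimal value or None) for one character."""
    return unicodedata.bidirectional(ch), unicodedata.decimal(ch, None)


for label, ch in SAMPLES.items():
    bidi_class, value = describe(ch)
    print(f"U+{ord(ch):04X} {label}: bidi={bidi_class} decimal={value}")

# The numeric semantics already survive the choice of glyph range:
print(int("\u0661\u0662\u0663"))  # prints 123
```

The letters report R or AL, the strong RTL classes, while the digits report EN or AN. A strongly RTL digit set as proposed above would be a new set of codepoints; nothing in the current data provides one.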

    Sincerely,

    gregg


