Jumping Cursor.

From: Peter R. Mueller-Roemer (pmr@cs.uni-frankfurt.de)
Date: Fri Aug 05 2005 - 08:49:11 CDT


    Over a year ago there was only mild interest in my problems with
    editing multilingual LINES (bidi, not one-directional paragraphs). Now
    I'm overwhelmed and cannot find the time even to read all your reactions.
    Gregg Reynolds was the first to understand and support me. Then the
    discussion branched out to
    1. Digits. Enclosed by RTL-text, it is very simple to enter
    ASCII numbers in the usual LTR manner, and I would find it confusing to
    have to enter R-digits in 'reverse' order. I guess there might be some
    problem with the automatic wrapping?
    But RTL-text embedded in LTR-text presently wraps correctly at the
    right. Good!
    I had no problem cutting and pasting such text. The mathematical
    meaning of a sequence of digits is usually clear from the
    context (I have seen only LTR-meaning; I have occasionally seen .1, .2
    ... as verse-numbers at the right of the Hebrew text, and pasting such
    text caused some problems in the resulting layout.)

    2. Parentheses within RTL-text surrounding RTL-text are most easily
    typed as a pair (you won't forget to close it) and then the RTL-text

    Now, Richard T. Gillam has convinced me that my suggestion of an R-SPace
    is not the simplest solution, and joins me in suggesting a rule /
    recommendation by Unicode to the TextProcessingSW-, GUI-,
    KeyboardLayout-makers of how to deal with the typed input and its
    graphical representation. I would like to add: and to the storage of
    the resulting text (automatic elimination of pairs of opposite direction

    So let us concentrate on how to best support the typing of mixed RTL and
    LTR text and its graphical and storage representation. It is indeed not
    a character-encoding problem, but one of coding and decoding keyboard
    output and the resulting input to editor-SW, so that the
    cursor/next-character position does not jump back and forth when
    a Space is entered into RTL-text within an LTR-paragraph.
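To make the jump concrete, here is a minimal Python sketch of why a freshly typed Space can change sides. It is only a rough reduction of the neutral-resolution idea in the Unicode Bidirectional Algorithm: it handles letters and spaces only, ignores digits, explicit embeddings and levels, and the function names `strong_direction` and `resolve_directions` are invented for this illustration.

```python
import unicodedata


def strong_direction(ch):
    """Return 'L' or 'R' for strongly directional characters, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "L"
    if bidi_class in ("R", "AL"):
        return "R"
    return None  # spaces and everything else are treated as neutral here


def resolve_directions(text, paragraph_dir="L"):
    """Assign a direction to every character (simplified, assumption-laden).

    A neutral takes the direction of its strong neighbours when both agree;
    otherwise it falls back to the paragraph direction. The start and end of
    the text count as having the paragraph direction.
    """
    strong = [strong_direction(ch) for ch in text]
    resolved = []
    for i, direction in enumerate(strong):
        if direction is not None:
            resolved.append(direction)
            continue
        before = next((s for s in reversed(strong[:i]) if s), paragraph_dir)
        after = next((s for s in strong[i + 1:] if s), paragraph_dir)
        resolved.append(before if before == after else paragraph_dir)
    return resolved


# Hebrew word inside an LTR paragraph, user has just typed a Space:
just_typed_space = "abc \u05d0\u05d1 "
# ...and then types the next Hebrew letter:
next_letter_typed = just_typed_space + "\u05d2"

print("".join(resolve_directions(just_typed_space)))
print("".join(resolve_directions(next_letter_typed)))
```

In this sketch the trailing Space first resolves as LTR (so it is drawn at the right-hand end of the Hebrew run) and flips to RTL as soon as another Hebrew letter follows, which is the back-and-forth movement described above.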

    I have designed my own tri-lingual keyboard with
    MS-KeyboardLayoutCreator but am frustrated at not seeing a way to
    generate the 3-character R-Space (ending with the PDF-character) and at
    not being able to re-code the Arrow-, BS- and Del-keys. I also miss the
    opportunity to add a documenting comment per character.
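As an illustration of what such a 3-character sequence could look like, here is a small Python sketch. The message only says that the sequence ends with PDF; using RLE (U+202B) as the opening character is an assumption made for this example, and `wrap_rtl` is a hypothetical helper name.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING (assumed opening character)
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def wrap_rtl(ch):
    """Enclose a character in an RTL embedding, closed by PDF."""
    return RLE + ch + PDF


r_space = wrap_rtl(" ")

# Show what a keystroke producing this sequence would have to emit.
for ch in r_space:
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))
```

Whether an editor then places the cursor as desired still depends on how it handles the formatting characters, which is the UI question under discussion.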

    See more inserted below.
    Peter R. Mueller-Roemer

    Richard T. Gillam wrote:

    >>You missed a strong point of the 'jumping cursor' problem:
    >>SAME LINE typing of text of different directionality should be
    >>even better. The typing of such texts is pretty well supported in
    >>Unicode, so that most Editors and Textprograms can do it even without
    >>providing R2L-PARAGRAPHS. Switching in the middle of the line the
    >>directionality of the paragraph has very undesirable effects.
    >>I should be able to just switch the keyboard-layout and not enter extra
    >>directionality characters and later PDF.
    >Sure, but this isn't a text-encoding issue. It's a keyboard-layout and
    >editing-UI issue. It might be worth it for Unicode to publish a
    >Technical Note or something recommending best practices for dealing with
    >bidirectional text, but that would be the extent of Unicode's
    >involvement. Nothing you complain about above presents a strong case
    >that things are amiss at the _encoding_ level. Furthermore, even if
    >they did, they don't present a case that the cure you're recommending
    >would be better than the disease.
    How can we best promote such a Technical Note? I am tired of devising
    special work-arounds for different editors, ...

    >To take another example from Hebrew, I think pretty much everybody
    >agrees that the fixed-position combining class assignments for the
    >points were a bad idea, and that they make properly handling Biblical
    >Hebrew a big pain in the butt, but it was also widely judged that the
    >obvious cure (changing the combining classes) would be worse than the
    >disease. They did fix the problem, but it wasn't nearly as simple a
    Do we now have a solution for the need to combine vowel-points AND
    cantillation-marks under the same base character?
    Why do these combining marks refuse to combine under the precomposed
    consonants with dagesh?
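For readers who want to see the fixed-position combining classes mentioned in the quoted text, the following Python sketch queries them with the standard `unicodedata` module and shows how canonical ordering rearranges a dagesh and a vowel point. It only illustrates the encoding-level behaviour; it does not say anything about how a particular font or editor renders the marks.

```python
import unicodedata

BET = "\u05d1"      # HEBREW LETTER BET
DAGESH = "\u05bc"   # HEBREW POINT DAGESH OR MAPIQ
PATAH = "\u05b7"    # HEBREW POINT PATAH
ETNAHTA = "\u0591"  # HEBREW ACCENT ETNAHTA (a cantillation mark)

# Each Hebrew point has its own canonical combining class.
for mark in (PATAH, DAGESH, ETNAHTA):
    print(unicodedata.name(mark), unicodedata.combining(mark))

# Typed as consonant, dagesh, vowel; normalization sorts marks by class,
# so the vowel point ends up before the dagesh in the stored text.
typed = BET + DAGESH + PATAH
normalized = unicodedata.normalize("NFD", typed)
print([f"U+{ord(c):04X}" for c in normalized])

# The precomposed BET WITH DAGESH (U+FB31) decomposes to the same
# base letter plus dagesh, so it ends up as the same sequence.
precomposed = unicodedata.normalize("NFD", "\ufb31" + PATAH)
print(precomposed == normalized)
```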
    Also in Greek I don't like it that I can't combine a spiritus lenis with
    an acute accent side by side (they are represented in outdated overtype
    mode). This seems an opportunity for Unicode to suggest some reasonable
    rules of graphical representation and micro-editing of
    combining-sequences with diacritical side-by-side or on top - e.g. by

    >The big difference between that case and this one is that there with the
    >vowel points, there was a problem at the encoding level-- there were
    >things that occurred in real text that simply couldn't be represented.
    >This problem was fixed, but in a suboptimal way. Here, we're not
    >talking about things that can't be represented at all; we're talking
    >about editing UI.
    And I plead for some design and guidance so that the UI's don't continue
    to diverge.

    >>I have thought of how to change editors to behave as I need it for
    >>MULTILINGUAL LINES and came to the conclusion that the easiest would be
    >>to change some keyboard layouts to provide a SPace, TAB and selected
    >>brackets, diacritics, punctuation with the other directionality. Extra
    >>code-points for these would be much preferable to keystrokes sending
    >>old characters enclosed in a pair of directionality characters.
    >Not necessarily. There are lots of reasons why the sequence of three
    >characters is preferable. From the end-user perspective, all that
    >matters is that the software "work right." Whether the keyboard driver
    >is generating one character code or three should be completely
    >immaterial to the end user. (And the fact that the three-character
    >solution is available is precisely why this isn't an encoding problem.)
    >I'm not arguing that the issue you describe isn't a problem; I'm simply
    >arguing that it's a problem with your software, not with Unicode.
    >--Rich Gillam
    > LAS
