Re: basic-hebrew RtL-space ? Combinable accents and vowels needed.

From: kefas (pmr@informatik.uni-frankfurt.de)
Date: Tue Nov 02 2004 - 09:58:24 CST

    RtL characters are a major breakthrough in Unicode!
    Please see my remarks inserted into your comments.

    On Monday 01 November 2004 10:16 pm, you wrote:
    > From: "kefas" <pmr@informatik.uni-frankfurt.de>
    >
    > > Inserting Unicode basic Hebrew results in a
    > > convenient RtL (right-to-left) advance of the
    > > cursor, but the space character jumps to the far
    > > right. Is there an RtL space?
    > > In MS Word and OpenOffice I can only change whole
    > > paragraphs to RtL entry. But quoting just a few
    > > words in Hebrew WITHIN a paragraph would be
    > > helpful to many.
    >
    > And this is what the embedding controls are made
    > for:
    > - surround an RTL subtext (Hebrew, Arabic...)
    > within LTR paragraphs (Latin...) with an RLE/PDF
    > pair.
    > - surround an LTR subtext (Latin, ...) within RTL
    > paragraphs (Hebrew, ...) with an LRE/PDF pair.
    >
    > There's no need for a separate RTL space, given that
    > the regular ASCII SPACE (U+0020) character is used
    > within all RTL texts as the standard default word
    > separator. It has a weak directionality that does
    > not force a direction break but is inherited from
    > the surrounding text.
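
    As a minimal sketch of the embedding approach described in the
    quoted text, the Python snippet below wraps a Hebrew phrase in an
    RLE/PDF pair inside a Latin sentence and looks up the Bidi class
    of each character. The helper names (embed_rtl, bidi_classes) and
    the sample phrase are illustrative assumptions, not taken from
    any editor. Note that the Unicode Character Database reports the
    class of U+0020 as WS (whitespace), which the Bidi algorithm
    treats as a neutral type.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def embed_rtl(text: str) -> str:
    """Wrap an RTL run so it can sit inside an LTR paragraph."""
    return f"{RLE}{text}{PDF}"


def embed_ltr(text: str) -> str:
    """Wrap an LTR run so it can sit inside an RTL paragraph."""
    return f"{LRE}{text}{PDF}"


def bidi_classes(text: str) -> list[str]:
    """Return the Bidi class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Two Hebrew words separated by an ordinary U+0020 space (sample data).
hebrew = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"
sentence = "He wrote " + embed_rtl(hebrew) + " in his notes."

# The space between the Hebrew words has no strong direction of its own.
print(bidi_classes(hebrew))
```

    The stored text stays in logical order; only the display engine
    uses the controls to decide how the embedded run is laid out.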
    My view: The RtL and LtR paragraph markers would not
    be needed any more, since at least the basic Hebrew
    characters are inserted with the cursor advancing to
    the left (no matter what the paragraph direction is).
    The space is bidirectional: in an LtR paragraph, after
    an RtL word has been typed, it jumps to the far right
    (assuming that the next insert will be LtR). This is
    an irritation to me. On entering an R-letter it jumps
    back to the left of the R-word. An RtL space would
    remove this irritation.
    With my keyboard layout I can type Latin text, press
    Caps Lock, write a few RtL words, and continue with
    L-text after releasing Caps Lock. I no longer need to
    let go of the keyboard to touch the mouse or change
    paragraph settings, fonts, etc. (I considered these
    a big waste of time until now).
    I also tried it in WordPad, and it works the same there.
    I consider the R-letters in Unicode a major
    advance, and just need an extra R-space.

    Alternatively, I would like Caps Lock to change the
    default to RtL, and releasing it to change back to LtR,
    but without throwing the R-text to the other end of the
    paragraph (which occasionally happens when clicking on
    the LtR paragraph sign).

    More on Hebrew:
     
    Meteg, Etnachta and most accents need to be combinable
    with vowel points in arbitrary order (Meteg is sometimes
    to the right of the vowel).

    At present you can't even copy the first word of the
    Hebrew Bible, b:resheeth, without running into this
    problem (typing B + dagesh. + shva: works neatly, so why
    not the other way around as well? They are on different
    parts: under and in the consonant). The first sentence
    contains several more examples.
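
    On the question of typing order, a small Python check (a sketch
    only; the variable and function names are illustrative) suggests
    that at the encoding level both orders are already equivalent.
    Dagesh and shva have different canonical combining classes, so
    Unicode normalization reorders them into one canonical sequence.
    Whether a given editor or font accepts both orders is a separate
    implementation matter that this check does not cover.

```python
import unicodedata

BET = "\u05d1"     # HEBREW LETTER BET
DAGESH = "\u05bc"  # HEBREW POINT DAGESH OR MAPIQ
SHEVA = "\u05b0"   # HEBREW POINT SHEVA
METEG = "\u05bd"   # HEBREW POINT METEG

dagesh_first = BET + DAGESH + SHEVA
sheva_first = BET + SHEVA + DAGESH


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Marks with different non-zero combining classes are sorted by class
# during normalization, so the typing order does not change the result.
for mark in (SHEVA, DAGESH, METEG):
    print(unicodedata.name(mark), unicodedata.combining(mark))

print(canonically_equivalent(dagesh_first, sheva_first))
```

    Marks that share the same combining class are not reordered, so
    their relative order remains significant.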

    >
    > A good question, however, is whether the space should
    > inherit its direction from the previous text or the
    > next one.
    > - If the previous text has a strong directionality,
    > then the space should inherit its direction. This
    > should be the case every time you are entering text
    > with a space at the end: it's very disturbing to see
    > this new space shift to the opposite side when
    > entering some space-separated Hebrew words within a
    > Latin text, because the editor assumes that no more
    > Hebrew will be added on the same line (this causes
    > surprising editing errors, for example when creating
    > a translation resource file where translated
    > resources are prefixed by an ASCII key, such as
    > when editing a .po file for GNU programs using
    > gettext()).
    > - If the previous text in the same paragraph has no
    > directionality, then it inherits its direction from
    > the text after it (if it has a strong
    > directionality);
    > - if this does not work then a global context for
    > the whole text should be used, or alternatively the
    > directionality of the end of the previous paragraph
    > (this influences where the cursor would go to align
    > such weakly-directed paragraph with the previous
    > paragraph, including the default start margin
    > position.)
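
    The three fallback steps in the quoted list can be sketched in
    Python as follows. This is only an illustration of the proposed
    editing heuristic, under the assumption that "strong" means Bidi
    class L, R or AL; the function names are hypothetical. It is not
    what the standard Bidi algorithm does when rendering, where a
    neutral between runs of opposite direction takes the embedding
    direction.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}
STRONG_LTR = {"L"}


def strong_direction(ch: str):
    """Return 'ltr' or 'rtl' for strong characters, otherwise None."""
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG_LTR:
        return "ltr"
    if cls in STRONG_RTL:
        return "rtl"
    return None


def space_direction(text: str, index: int, default: str = "ltr") -> str:
    """Resolve the direction of the space at text[index] while editing."""
    # Step 1: inherit from the nearest strong character before the space.
    for ch in reversed(text[:index]):
        direction = strong_direction(ch)
        if direction:
            return direction
    # Step 2: otherwise inherit from the nearest strong character after it.
    for ch in text[index + 1:]:
        direction = strong_direction(ch)
        if direction:
            return direction
    # Step 3: otherwise fall back to the global (or previous paragraph) default.
    return default


# A .po-style line: ASCII key, then a Hebrew word, then a freshly typed space.
line = "key = \u05e9\u05dc\u05d5\u05dd "
print(space_direction(line, len(line) - 1))
```

    With this rule the space typed after a Hebrew word stays with the
    Hebrew run instead of jumping to the far side of the line.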
    >
    > The regular Bidi algorithm should be used to render
    > a complete text, but strict Bidi rules should not
    > always be obeyed while composing a text, where the
    > current cursor position should act as a sentence
    > break with a strong inherited directionality: the
    > text can then be redirected at this position when
    > the cursor moves to other parts of the text.
    >
    > I don't think this is an issue of renderers but of
    > editors. This is notable in Notepad, where you won't
    > know exactly where to enter a space while editing
    > unless you use the contextual menu that allows
    > switching the global default directionality and
    > swapping the alignment to the side margins. Sometimes,
    > when you want to know where there are LRE/RLE and
    > PDF Bidi controls, it's nearly impossible to
    > determine this visually in Notepad unless you use an
    > external tool such as native2ascii, from the Java
    > SDK, to change the encoding with clearly visible
    > marks. It's unfortunate, given that Notepad (since
    > Windows XP) offers you a directly accessible
    > contextual menu to enter Bidi controls and change
    > the global direction and alignment to side margins.
    > (But Notepad has a "visible controls" editing mode
    > to solve such ambiguities.)
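
    For making the invisible controls visible without an external
    tool, a short Python sketch along these lines may help. The
    function name and the bracketed label format are assumptions
    chosen for illustration; the code points themselves are the
    standard ones.

```python
# Map of directional formatting characters to readable labels.
BIDI_CONTROLS = {
    "\u200e": "LRM",  # LEFT-TO-RIGHT MARK
    "\u200f": "RLM",  # RIGHT-TO-LEFT MARK
    "\u202a": "LRE",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b": "RLE",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c": "PDF",  # POP DIRECTIONAL FORMATTING
    "\u202d": "LRO",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e": "RLO",  # RIGHT-TO-LEFT OVERRIDE
}


def show_bidi_controls(text: str) -> str:
    """Replace each Bidi control with a visible [NAME] marker."""
    return "".join(
        f"[{BIDI_CONTROLS[ch]}]" if ch in BIDI_CONTROLS else ch
        for ch in text
    )


sample = "abc \u202b\u05d0\u05d1\u202c def"
print(show_bidi_controls(sample))
```

    For a view closer to what native2ascii produces, Python's
    "unicode_escape" codec (for example
    sample.encode("unicode_escape")) escapes every non-ASCII
    character rather than only the controls.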
    >
    > > Related: The other Hebrew characters in the
    > > alphabetic presentation forms insert themselves in
    > > LtR-fashion? Why this difference?
    > > I read about Logical and Visual entry, but don't
    > > see how that answers my 2 questions above.
    >
    > Visual entry should never be used. It was used for
    > some legacy encodings to render text on devices that
    > don't implement the Bidi algorithm and can only
    > render text as LTR. Nobody enters RTL text in
    > "pseudo-visual" LTR order; only the logical input
    > order is needed.
    >
    > But don't confuse the input order with the encoding
    > order, as they can be different (they should not be
    > if the text is converted and stored in Unicode, where
    > only the logical order is legal for any mix of Latin,
    > Greek, Cyrillic, Hebrew, and Arabic).
    >
    > The case for Thai is different because its input
    > order is (historically) visual rather than logical,
    > and the text is then encoded using the same (visual)
    > order. This was not changed for Thai in Unicode, to
    > keep its compatibility with the national Thai
    > standard TIS-620 (and further revisions). So even
    > though Thai uses a non-logical order, its input
    > order and encoding order are the same.
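
    The Thai case can be illustrated with a two-character syllable.
    In the Python sketch below (the helper name is an illustrative
    assumption), the vowel sign SARA E is written to the left of the
    consonant KO KAI and is also stored first, even though it is
    pronounced after the consonant.

```python
import unicodedata


def code_point_names(text: str) -> list[str]:
    """Return the Unicode character names in stored (encoding) order."""
    return [unicodedata.name(ch) for ch in text]


# Stored order matches the left-to-right visual order: vowel sign first.
syllable = "\u0e40\u0e01"
for ch, name in zip(syllable, code_point_names(syllable)):
    print(f"U+{ord(ch):04X}", name)
```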
    >
    > The difference between encoding orders is known mainly
    > from historic texts created for modern Hebrew, and
    > more rarely Arabic, or from texts encoded in a
    > private pre-press encoding used to prepare the
    > global layout of pages (these texts are more easily
    > and quickly processed in complex page layouts if they
    > are prepared in visual order before flowing them into
    > the page layout template; such applications use
    > specific encodings in a richer rendering context
    > than just plain text, so this is out of scope of the
    > Unicode standard itself).


