Hebrew shaping (was RE: Benefits of Unicode)

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Feb 26 2001 - 06:00:12 EST

Sorry for coming back so late to an old thread (29 Jan 2001).

I (Marco Cimarosti) wrote:
> The different positional forms of a letter in Arabic, Syriac or Mongolian
> are encoded with the same code point; the rendering engine must select the
> proper form. The same problem in Greek and Hebrew has been addressed using
> different code points for final and non-final letters, which must be
> assigned to separate keys on the keyboard.

Jonathan Rosenne replied:
> Arabic and Hebrew are misleadingly similar in this respect.
> While Arabic shaping is rather regular, Hebrew has too many exceptions,
> making automatic shaping unsuitable.

I tried to find out more on my own but had no success.

All the Hebrew grammar books I have at home just say that the final form of
a letter is used at the end of a word, full stop. But all my books are of the
"Learn Hebrew yourself in two weeks" kind, and my references on Yiddish
are even more basic.
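
For reference, the textbook rule above (final form at the end of a word, full
stop) can be sketched in a few lines of Python. This is only a minimal sketch
of that naive rule, not a description of how any real rendering engine or
input method works. The helper name apply_final_forms is hypothetical, and the
sketch assumes that any character other than a letter or a combining mark ends
a word, which is exactly the kind of assumption the exceptions would break.

```python
import unicodedata

# Non-final letter -> final letter, using the separate Unicode code points.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # HEBREW LETTER KAF   -> FINAL KAF
    "\u05DE": "\u05DD",  # HEBREW LETTER MEM   -> FINAL MEM
    "\u05E0": "\u05DF",  # HEBREW LETTER NUN   -> FINAL NUN
    "\u05E4": "\u05E3",  # HEBREW LETTER PE    -> FINAL PE
    "\u05E6": "\u05E5",  # HEBREW LETTER TSADI -> FINAL TSADI
}


def apply_final_forms(text: str) -> str:
    """Apply the naive rule: use the final form at the end of every word.

    Hypothetical helper for illustration; it knows nothing about exceptions.
    """
    chars = list(text)
    n = len(chars)
    for i, ch in enumerate(chars):
        if ch not in FINAL_FORMS:
            continue
        # Skip vowel points and other combining marks attached to this letter.
        j = i + 1
        while j < n and unicodedata.category(chars[j]).startswith("M"):
            j += 1
        # Assumption: end of text or any non-letter marks the end of a word.
        if j == n or not chars[j].isalpha():
            chars[i] = FINAL_FORMS[ch]
    return "".join(chars)


# "shalom" typed with a non-final mem comes out with a final mem.
print(apply_final_forms("\u05E9\u05DC\u05D5\u05DE") == "\u05E9\u05DC\u05D5\u05DD")
```

If a rule this simple were always right, automatic shaping for Hebrew would be
trivial; the question is where and how often it fails.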

Can you describe these exceptions? How frequent are they? In which
language(s) do they occur?

I know that the Arabic script also sometimes deviates from its basic shaping
rules (e.g. in abbreviations, in texts about grammar, and even in ordinary
Farsi spelling), but these exceptions are rare enough that Unicode and other
encoding systems preferred to address them with specialized layout controls
(ZWJ, ZWNJ, TATWEEL). How is Hebrew different?
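
To make the three characters named above concrete, here is a minimal Python
sketch that looks them up in the Unicode character database. The helper name
describe and the Farsi sample word are assumptions chosen for illustration
only; the sample is the common spelling "mi" + ZWNJ + "khaham", where the ZWNJ
prevents the two parts from joining.

```python
import unicodedata

ZWJ = "\u200D"      # ZERO WIDTH JOINER
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER
TATWEEL = "\u0640"  # ARABIC TATWEEL


def describe(ch: str) -> str:
    """Return code point, character name and general category of ch."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


for ch in (ZWJ, ZWNJ, TATWEEL):
    print(describe(ch))

# Illustrative Farsi word: the same letters with and without a ZWNJ.
joined = "\u0645\u06CC\u062E\u0648\u0627\u0647\u0645"
with_zwnj = "\u0645\u06CC" + ZWNJ + "\u062E\u0648\u0627\u0647\u0645"

# The ZWNJ is an extra code point in the text, not a different letter.
print(len(joined), len(with_zwnj))
```

ZWJ and ZWNJ are format characters (category Cf), while TATWEEL is classified
as a modifier letter (category Lm). In all three cases the exception is marked
by an extra character in the text, and the letters themselves keep their
single code point.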

Toda raba for any info.

_ Marco
