5 Hebrew Consonants Shaping

From: Arno Schmitt (arno@zedat.fu-berlin.de)
Date: Thu May 27 1999 - 05:05:53 EDT


There is NO GOOD argument for having different keys for the
final shapes of some Hebrew letters.

Jonathan Rosenne's "argument" is tradition, not reason, so IMHO no
argument at all:
>
> In Hebrew, this is the way we do it since print was invented in the
> 16th century. This is the way it was implemented in typewriters, in
> unit record equipment and in computers.
>

I want to take it a step further -- and thus bring it into the
area of Unicode proper:
There is no good argument for having different codepoints
for these final letters. ZWJ and ZWNJ are there precisely for the
few exceptions.
The only examples given so far (shlep and Philip) are not good
arguments:
The Hebrew point Dagesh (U+05BC) transforms a fe into a pe, and
since there is no final form of pe (only of FE -- the Unicode
name for U+05E3, "final pe", is not correct), PE plus Dagesh should
be treated like PE plus ZWNJ => no final form even at the end of a
word.
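
The rule proposed above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions, not how Unicode or any
real renderer works: the text is assumed to be stored with base letters
only, ZWNJ is assumed to follow the letter and its points, and the names
shape_finals and FINAL_OF are made up for this sketch.

```python
import unicodedata

ZWNJ = "\u200C"    # zero width non-joiner
DAGESH = "\u05BC"  # Hebrew point dagesh
PE = "\u05E4"

# Base letter -> final shape (KAF, MEM, NUN, PE, TSADI).
FINAL_OF = {
    "\u05DB": "\u05DA",
    "\u05DE": "\u05DD",
    "\u05E0": "\u05DF",
    "\u05E4": "\u05E3",
    "\u05E6": "\u05E5",
}

def shape_finals(text):
    """Pick final shapes for text stored with base letters only (hypothetical scheme)."""
    out = []
    n = len(text)
    for i, ch in enumerate(text):
        final = FINAL_OF.get(ch)
        if final is None:
            out.append(ch)
            continue
        # Skip the points attached to this letter, noting a dagesh.
        j = i + 1
        has_dagesh = False
        while j < n and unicodedata.category(text[j]) == "Mn":
            has_dagesh = has_dagesh or text[j] == DAGESH
            j += 1
        nxt = text[j] if j < n else ""
        # Another Hebrew letter or an explicit ZWNJ blocks the final shape.
        blocked = nxt == ZWNJ or ("\u05D0" <= nxt <= "\u05EA")
        if blocked or (ch == PE and has_dagesh):
            out.append(ch)
        else:
            out.append(final)
    return "".join(out)
```

With this sketch, KAF at the end of tsarich gets its final shape, while
the last PE of Philip (PE plus Dagesh) and the PE of shlep (PE plus
ZWNJ) keep the canonical shape.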

In German "Gut" (estate, merchandise) and "gut" (fine) are
different words -- similar to French "mere" and "mère", "conte" and
"compte" -- and
"Genossen" (comrades) and "genossen" (enjoyed) are not the same
word at all. If you consider words like "Liebe" (the love) and
"liebe (Freunde)" (good friends) there are innumerable pairs. And
different words should be treated differently in most contexts.
But tsarich (needing, singular) and tsrichim (needing, plural),
tsorech (need) and tsrachim (needs) should be treated in many
contexts as the same word (the Hebrew spelling for them is the
same, just the regular masculine plural ending is added to the base
word). Although two of these words are written with the final
shape of KAF (U+05DA), and the other two with the canonical form
(U+05DB), they are the same.

Does this answer Jony's question:
> When do you need such a comparison and for what purpose?
in reply to what
Mark Leisher wrote:
>>A common processing example: if we need a comparison or search routine that
>>treats nominal and contextual forms the same, I don't ask a coder to add
>>special rules to handle the special cases. I just tell them to ignore control
>>characters in their algorithm. Opportunities to introduce bugs just got
>>smaller.
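
To make the comparison case concrete, here is a minimal Python sketch
of a key function that treats final and canonical shapes as the same
letter. With the codepoints as they are encoded today this needs an
explicit folding table; under the scheme argued for above, dropping ZWJ
and ZWNJ would be enough. The name fold_finals is an assumption made
for this illustration.

```python
ZWJ = "\u200D"
ZWNJ = "\u200C"

# Final shape -> base letter, plus removal of the two joiner controls.
FOLD_TABLE = {
    ord("\u05DA"): "\u05DB",  # FINAL KAF   -> KAF
    ord("\u05DD"): "\u05DE",  # FINAL MEM   -> MEM
    ord("\u05DF"): "\u05E0",  # FINAL NUN   -> NUN
    ord("\u05E3"): "\u05E4",  # FINAL PE    -> PE
    ord("\u05E5"): "\u05E6",  # FINAL TSADI -> TSADI
    ord(ZWJ): None,
    ord(ZWNJ): None,
}

def fold_finals(text):
    """Return a comparison key that ignores final shapes and joiner controls."""
    return text.translate(FOLD_TABLE)

# tsarich (ends in FINAL KAF) and tsrichim (KAF in the middle, FINAL MEM at the end)
tsarich = "\u05E6\u05E8\u05D9\u05DA"
tsrichim = "\u05E6\u05E8\u05D9\u05DB\u05D9\u05DD"

# After folding, the singular is a plain prefix of the plural.
same_base = fold_finals(tsrichim).startswith(fold_finals(tsarich))
```

Without the folding step the prefix test fails, because the singular
ends in U+05DA while the plural has U+05DB in the same position.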



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:46 EDT