RE: Hebrew: glyphs vs. codepoints

From: Jonathan Rosenne (
Date: Mon May 31 1999 - 03:31:37 EDT

Holam: This is a matter of taste. Some printers make this distinction, most don't. The argument that there are two different meanings is irrelevant. For example, the letter S represents two distinct sounds in English, yet I don't propose to split it into two code points.

Finals: The final Hebrew letters are not defined as compatibility characters in the database. The views of the Unicode editors are reflected in the text of the standard plus the database. The matter had been discussed a few times, and our position - against automatic shaping in Hebrew - was accepted.
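
The database point can be checked mechanically. What follows is only a sketch using Python's standard unicodedata module, which reflects whatever version of the Unicode database ships with the interpreter rather than the 1999 data discussed here. The helper name has_compat_mapping is made up for illustration, and U+FB4F (HEBREW LIGATURE ALEF LAMED) is chosen merely as a contrasting Hebrew presentation form.

```python
import unicodedata

# The five Hebrew final letters in the main Hebrew block.
FINALS = ["\u05DA", "\u05DD", "\u05DF", "\u05E3", "\u05E5"]


def has_compat_mapping(ch: str) -> bool:
    """Hypothetical helper: True if the database lists a tagged
    (compatibility) decomposition such as '<compat> ...' for ch."""
    return unicodedata.decomposition(ch).startswith("<")


for ch in FINALS:
    # An empty decomposition field means the character is not treated
    # as a compatibility variant of anything else.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), has_compat_mapping(ch))

# For contrast, a presentation form that does carry a <compat> mapping.
ligature = "\uFB4F"
print(f"U+{ord(ligature):04X}", unicodedata.name(ligature),
      unicodedata.decomposition(ligature))
```

In a current database the finals show no decomposition at all, so compatibility normalization (NFKC) leaves them untouched.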


> -----Original Message-----
> From: Arno Schmitt
> Sent: Monday, May 31, 1999 7:47 AM
> To: Unicode List
> Subject: Re: Hebrew: glyphs vs. codepoints
> Jonathan Rosenne:
> > There aren't two Holams. These are glyphs, not characters.
> The difference between "matsot" (loaves of unleavened bread)
> and "mitswot" (obligations)
> is not a (typo)graphical difference:
> in "matsot" the holam stands to the right of the waw,
> thus making the waw into a mater lectionis, mere "carrier" of the
> waw (waw plus right holam = [o]),
> in "mitswot" the holam stands to the left of waw, here the waw is
> a normal consonant (waw plus left holam = [wo]).
> To say apodictically: "There aren't two Holams." is not
> enough!
> With alef we have the same two possibilities:
> in "rosh" and "bo" the holam sits on the right of the "carrier"
> alef (right holam on alef = [o])
> in "bo'i" the holam sits on the left of the bet because the alef
> here is the consonant glottal stop giving [o'],
> in "'oax" and "'otem" (with tet) we have the left holam on alef:
> ['o]
> Jony wrote:
> > it is not the job of international standards to improve local
> > traditions.
> I do not propose that you change the way Hebrew is written or
> printed,
> I just point out that on computers there are more intelligent
> input methods than on the typewriter, and that handling of text is
> easier when in "tsarix" and "tsrixim" the "x" has the same
> codepoint. "for most of our purposes, this greatly simplifies
> writing code to process the text." as Mark Leisher put it.
> May I remind you of what John Cowan quoted and wrote earlier:
> # Variant forms of five Hebrew letters are encoded as separate
> # characters in all Hebrew standards; therefore this practice
> # is followed in the Unicode Standard. These five variant
> # forms are encoded in this block rather than the compatibility
> # zone in order to retain structural consistency between this
> # block and ISO 8859-8.
> JC: This tends to indicate, IMHO, that the editors of the Unicode
> JC: Standard did not view the final forms as anything but compatibility
> JC: characters, preserved only to make roundtripping with 8859-8
> JC: and other 8-bit standards easy.
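
For readers who want to see what the "tsarix" / "tsrixim" point amounts to in practice, here is a minimal sketch, not a recommendation for how Hebrew should be stored. It folds the five final forms onto their non-final counterparts before comparing strings. The names FINAL_TO_BASE and fold_finals are hypothetical, and the Hebrew spellings assumed for the two transliterated words (tsadi, resh, yod, final kaf, and tsadi, resh, yod, kaf, yod, final mem) are not given in the message itself.

```python
# Illustrative table: final form -> non-final form of the same letter.
FINAL_TO_BASE = str.maketrans({
    "\u05DA": "\u05DB",  # final kaf   -> kaf
    "\u05DD": "\u05DE",  # final mem   -> mem
    "\u05DF": "\u05E0",  # final nun   -> nun
    "\u05E3": "\u05E4",  # final pe    -> pe
    "\u05E5": "\u05E6",  # final tsadi -> tsadi
})


def fold_finals(text: str) -> str:
    """Hypothetical helper: replace each final letter by its base letter.

    Intended for building search or comparison keys only; the folded
    string is not a correct spelling and should not be displayed.
    """
    return text.translate(FINAL_TO_BASE)


# Assumed spellings of the transliterated examples.
tsarix = "\u05E6\u05E8\u05D9\u05DA"               # ends in final kaf
tsrixim = "\u05E6\u05E8\u05D9\u05DB\u05D9\u05DD"  # kaf, then final mem

# As stored, the two words differ at the fourth letter.
print(tsrixim.startswith(tsarix))

# After folding, the shorter word is a prefix of the longer one.
print(fold_finals(tsrixim).startswith(fold_finals(tsarix)))
```

Folding at comparison time keeps the stored text as typed, which is compatible with the position against automatic shaping stated above.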

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:46 EDT