Re: Back to Hebrew -holem-waw vs waw-holem

From: Peter Kirk
Date: Tue Jul 29 2003 - 20:22:03 EDT


    On 29/07/2003 11:20, Ted Hopp wrote:

    >Okay -- there are two Hebrew vowels that are not encoded in Unicode. Their
    >(transliterated) Hebrew names are (caps indicate syllable accent): khoLAM
    >maLE and shuRUQ. The kholam male LOOKS like a "vav with holam" [05D5.05B9]
    >or the alphabetic presentation form FB4B (HEBREW LETTER VAV WITH HOLAM) and
    >the shuruq LOOKS like a vav with dagesh [05D5.05BC] or the alphabetic
    >presentation form FB35 (HEBREW LETTER VAV WITH DAGESH). (For the record, the
    >Unicode HEBREW POINT HOLAM [05B9] is usually called khoLAM khaSER in
    >The two vowels kholam male and shuruq have nothing to do with the consonant
    >vav (HEBREW LETTER VAV) other than that they are written with the same
    >glyph. In unpointed Hebrew text, the vav glyph is used to represent these
    >vowels but, outside of ketiv male, the use is often optional (although
    >sometimes strictly determined by tradition). (For instance, the name Aharon
    >appears in Hebrew bible scrolls sometimes with a vav glyph after the resh
    >and sometimes without. It would be nice if I could search for all
    >occurrences of the name by doing a "match consonants only" search instead of
    >having to resort to regular expressions.) In some texts (e.g., many of the
    >books published by ArtScroll), the kholam male and vav with kholam are
    >rendered differently--the former with the dot centered above the vav and
    >latter with the dot somewhat more to the left. I have not seen a text that
    >renders a shuruq differently than a vav with dagesh. (However, a dagesh has
    >nothing to do with a shuruq, despite the nice little note in the Unicode
    >code chart. A consonantal vav with a dagesh is NOT a shuruq.)
    Thanks for this useful information.
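The search problem Ted describes can be illustrated with a minimal sketch. It assumes Python's standard `unicodedata` module; the two pointed spellings of "Aharon" and the helper name `strip_points` are illustrative choices, not taken from any particular text or library. Stripping the points leaves two different letter sequences, because the vowel-bearing vav is encoded as the letter U+05D5 and survives the stripping.

```python
import unicodedata


def strip_points(text: str) -> str:
    """Drop every nonspacing mark (category Mn), which covers the Hebrew points."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two pointed spellings of "Aharon" (assumed for illustration), written as
# escapes to avoid bidirectional display issues in the source.
# alef+patah, he+hataf patah, resh+holam, final nun
AHARON_WITHOUT_VAV = "\u05D0\u05B7\u05D4\u05B2\u05E8\u05B9\u05DF"
# alef+patah, he+hataf patah, resh, vav+holam, final nun
AHARON_WITH_VAV = "\u05D0\u05B7\u05D4\u05B2\u05E8\u05D5\u05B9\u05DF"

skeleton_without = strip_points(AHARON_WITHOUT_VAV)
skeleton_with = strip_points(AHARON_WITH_VAV)

# The skeletons differ: U+05D5 is a letter, so a plain "ignore the points"
# comparison cannot tell a vowel-carrying vav from a consonantal one.
print(skeleton_without == skeleton_with)  # False
```

A real "match consonants only" search would need some extra signal, such as a distinct code point or out-of-band markup, to know which vavs to skip.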

    >Furthermore, context cannot be used to distinguish vav with kholam vs.
    >kholam male. As I posted once before, at least one major dictionary uses a
    >single consonant with both a patah and a kholam male (NOT a consonantal vav
    >with kholam) to transliterate foreign words. Hebrew characters are used for
    >much more than spelling Hebrew words.
    Good point. The algorithm I suggested works only for orthographically
    regular Hebrew.

    >These different uses for the same (or approximately same) glyphs cannot, as
    >far as I know, be distinguished in Unicode. (Putting a HEBREW POINT HOLAM in
    >front of a HEBREW LETTER VAV would just associate the kholam with the
    >preceding letter.) It might be nice if there were different code points for
    >them. Alphabetic presentation forms don't quite do the trick. When I first
    >saw it, I had assumed that FB4B was supposed to be used for kholam male (and
    >that's what we use it for in our code). Of course, I could have assumed that
    >it was intended for (consonantal) vav with kholam. However, that sequence
    >automatically renders with the dot more to the left, so (for us) a
    >presentation form was unnecessary in that case. Will all font designers who
    >include Hebrew alphabetic presentation forms conform to my assumptions? Can
    >anyone authoritatively say what was intended? I don't think so. This is a
    U+FB4B has a canonical decomposition into vav holam, so it cannot be
    used for anything distinct from vav holam. Maybe it was originally
    intended for holam male, but if so the people who defined the
    decomposition forgot that. But there is nothing to stop the UTC from
    defining a new character HEBREW LETTER HOLAM MALE with no canonical
    decomposition (but perhaps a compatibility one), a glyph with the holam
    clearly to the right, and a note explaining the distinction from vav
    plus holam. That would be one sensible way ahead.
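The decomposition point can be checked with a short sketch using Python's standard `unicodedata` module. The results depend on the Unicode data version bundled with the interpreter; the constant name is illustrative only. Any process that normalizes text should turn U+FB4B into the sequence U+05D5 U+05B9, so a distinction carried only by U+FB4B would not survive normalization.

```python
import unicodedata

VAV_WITH_HOLAM = "\uFB4B"  # HEBREW LETTER VAV WITH HOLAM

# The raw decomposition mapping from the character database.
# No "<compat>"-style tag prefix means the mapping is canonical.
mapping = unicodedata.decomposition(VAV_WITH_HOLAM)
print(mapping)

# Both canonical normalization forms yield the two-character sequence.
normalized = {
    form: unicodedata.normalize(form, VAV_WITH_HOLAM) for form in ("NFC", "NFD")
}
for form, text in normalized.items():
    print(form, [f"U+{ord(ch):04X}" for ch in text])
```

NFC does not recompose the pair because the Hebrew presentation forms are listed among the composition exclusions.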

    >Other typographic curiosities: The HEBREW POINT QAMATS [05B8] is used for
    >two Hebrew vowels: qamats katan (pronounced in Israeli Hebrew like the 'o'
    >in American English 'corn', as is kholam male) and qamats gadol (pronounced
    >like 'a' in American English 'father', as is patah when not under a final
    >HE, HET, or AYIN). Dictionaries usually list the two as separate vowels but
    >render them identically. HOWEVER, some text publishers now distinguish these
    >two vowels typographically (e.g., Revised Siddur Sim Shalom published by the
    >Rabbinical Assembly). Perhaps there should be an alphabetic presentation
    >form for qamats katan.
    The two qamatses were distinguished as early as 1850 in Benjamin
    Davidson's "The Analytical Hebrew and Chaldee Lexicon", of which I have
    a facsimile edition. But Davidson did not distinguish the holam vavs or
    the shevas.

    >The same comment goes for HEBREW POINT SHEVA [05B0]: in pronunciation it
    >comes in two flavors, called sheva na ("moving sheva" -- pronounced
    >something like the vowel segol) and sheva nakh ("resting sheva" -- silent).
    >Again, most dictionaries list these as separate vowels but render them
    >identically, while some publishers now distinguish them typographically
    >(e.g., Tikkun Korim Simanim, published by Feldheim). Again, should there be
    >an alphabetic presentation form for sheva na?
    >With that, I'll leave off.
    >Ted (not content with a focussed discussion)

    Peter Kirk

    This archive was generated by hypermail 2.1.5 : Tue Jul 29 2003 - 21:02:18 EDT