UTC - Holam proposals

From: Jony Rosenne (rosennej@qsm.co.il)
Date: Mon Aug 02 2004 - 22:44:24 CDT

Additional remarks (collected from other items in the list and slightly edited):

I consider the interchange of texts to be one of the main aims of Unicode. Otherwise, we could just tell the users to use the PUA for any character they miss.

The distinction between the two instances of Holam should be ignorable in the sense that if a rendering system chooses not to make the distinction, which is a valid style for Hebrew, then it can do so, without affecting the text, even if the text did make the distinction. The ZWNJ proposal meets this requirement.
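
To illustrate what "ignorable" means here, a minimal Python sketch follows. It assumes the marked form is written as VAV, ZWNJ, HOLAM; the exact order of the sequence is an assumption for illustration only, and the helper name strip_format_controls is hypothetical. A system that does not make the distinction can drop the format control and is left with exactly the text an unmarked document would contain.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

# Assumed marked sequence (illustrative only): VAV, ZWNJ, HOLAM.
marked = VAV + ZWNJ + HOLAM
# The sequence found in text that does not make the distinction.
plain = VAV + HOLAM


def strip_format_controls(text):
    """Drop every character whose general category is Cf (format control)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# ZWNJ is a format control, so removing it recovers the unmarked text.
zwnj_is_format_control = unicodedata.category(ZWNJ) == "Cf"
recovers_plain_text = strip_format_controls(marked) == plain
```

Both flags evaluate to True: the distinction is carried entirely by a character that a system may ignore.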

The Holam Haser for Vav proposal will create interchange problems not only with the existing data, which for some reason some people do not consider important, but also for new data created with the new character and displayed on a system that does not implement it.
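
The interchange problem can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that the proposed character sits at U+05BA (this message names no code point) and that it has no decomposition to HOLAM; the helper name same_after_normalization is hypothetical.

```python
import unicodedata

VAV = "\u05D5"        # HEBREW LETTER VAV
HOLAM = "\u05B9"      # HEBREW POINT HOLAM
NEW_POINT = "\u05BA"  # assumed code point for the proposed character

# Existing data writes the point on Vav with the ordinary HOLAM.
legacy_text = VAV + HOLAM
# New data would write it with the proposed character.
new_text = VAV + NEW_POINT


def same_after_normalization(a, b, form):
    """Compare two strings after applying one Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Without a decomposition, no normalization form maps one text to the other,
# so searching or comparing across old and new data fails.
forms = ("NFC", "NFD", "NFKC", "NFKD")
matches = [same_after_normalization(legacy_text, new_text, f) for f in forms]
```

Under these assumptions every entry in matches is False, so a process that has to relate old and new data needs special-case knowledge of the new character.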

> -----Original Message-----
> From: hebrew-bounce@unicode.org
> [mailto:hebrew-bounce@unicode.org] On Behalf Of Michael Everson
> Sent: Tuesday, August 03, 2004 4:40 AM
> To: hebrew@unicode.org
> Cc: Peter Constable; Ken Whistler
> Subject: [hebrew] Re: Holam background document
> At 15:04 -0700 2004-08-02, John Hudson wrote:
> >As far as I'm concerned, we're trying to correct a mistake in the
> >Hebrew block. There is only *one* logical way to do this that is
> >perfectly consistent with the character/glyph model and the identity
> >of the dot on holam male. That is to separately encode the dot for
> >holam male character that should have been encoded in the first
> >place.
> I disagree, John. Ada Yardeni says quite explicitly:
> "Holam [is] a dot above a Waw or to the left of the upper corner of
> any letter."
> and
> "The Holam on the Waw should be placed above the centre of its
> "head", while the Holam on other letters should be placed to the left
> of their upper corners."
> Yardeni has described perfectly what I judge to be consensus on this
> list. She describes the default position of the POINT HOLAM on
> consonants, and describes its default position on VAV in particular.
> Statistics have certainly shown that this description is the best,
> and that holam male is the default use of POINT HOLAM on VAV.

It is quite clear that she ignores Vav Haluma. The Holam of Vav Haluma is
just as it is for other letters, above left, and is not distinct in any way.

> >It should be perfectly obvious to anyone that this rejection forces
> >us into a position of compromise, because whatever solution is
> >selected will not be the one logical solution that is consistent
> >with the character/glyph model and the identity of holam male dot as
> >a separate character.
> I disagree. I believe that when the tradition moves the dot it is
> creating a new character, whether or not the dot has a similar
> function. Of course, a dot is a dot. But n-with-dot-above is used to
> represent the velar nasal consonant in Latin transliterations of
> Brahmic scripts, and n-with-dot-below is used to represent the
> retroflex nasal consonant. Moving the dot makes a difference, and it
> makes sense to encode the two characters separately.
> Similarly we have the case in Hebrew. The tradition agrees that the
> position of the dot has significance. The tradition -- and I consider
> Yardeni to be authoritative in speaking for the tradition -- suggests
> quite clearly what the default behaviour of HOLAM is.

This is based on a superficial and incorrect understanding of the text and
of the Hebrew script in general.

> The tradition places a dot in a different place when it has a
> different meaning. It appears to be a unique solution, intended for
> use only with a particular letter, which is why Mark and I called it
> >It is precisely because the solution will be a compromise that no
> >one is going to be completely happy with the result, but also why
> >there is no point on standing on principle or launching objections
> >from fundamentals that have already been compromised. We need a
> >solution that works reliably and gets the job done, and that's all
> >we can really hope for. Purity isn't on the table.
> The joiner "solution" is an ill-conceived hack which tries to press
> the ZWNJ, whose behaviour is appropriate for cursive scripts (like
> Arabic or Brahmic scripts), into service for the non-cursive Hebrew
> script, on foot of a presumption that Hebrew points form ligatures
> with Hebrew base-letters. If "purity" is to be considered, the
> intrinsic cursivity or non-cursivity of a script, and indeed the
> concept of "ligature", should be taken into account.

The Latin-Greek-Cyrillic scripts have a very different attitude to combining
marks when compared with Hebrew and Arabic.

The joiner solution is probably a hack, but I don't see why it is
ill-conceived. If the behavior is accepted by Unicode for one script, it may
be used for another. I don't think it is valid to categorize scripts into
cursive and non-cursive, and if one were to categorize scripts with respect to
the way they handle combining marks I would think that Hebrew (and Arabic,
for different reasons) would have their own categories.

> The HOLAM HASER FOR VAV proposal is simple. It recognizes that HEBREW
> POINT HOLAM is the default character used for a dot above
> representing [o] in the Hebrew script. It recognizes that there are
> many implementations which do not distinguish VAV + HOLAM graphically
> when it could mean either [o] or [vo]. It recognizes that, in
> implementations which *do* make such a distinction, the dot for
> VAV + HOLAM with the meaning [o] is centred over the VAV, rather than
> positioned to the left as it is for all other characters, and it
> recognizes that the dot for VAV + HOLAM with the meaning [vo] is
> positioned to the left. It recognizes that the left positioning of
> the HOLAM over VAV is a *marked* positioning, and it is for this
> reason that a new character was proposed for this particular usage.

There is no reason to distinguish between a Holam Haser point on a Vav and a
Holam Haser point on any other letter. It has the same appearance and the
same semantics. There is no basis for such a distinction. Unicode does not
encode glyphs, it encodes characters. A Holam Haser point is a Holam Haser
point, always.

The problem we have is with the Holam Male, and should be solved in that
context. The first proposal attempts to do this. The UTC should decide
whether this is an acceptable use of ZWNJ (or ZWJ) or not, and if it is
there is no need for a disruptive new character.

> This is completely analogous to the model accepted by the UTC and WG2
> for the character QAMATS QATAN. The proposal for that character
> recognized that there are many implementations which do not
> distinguish QAMATS when it could mean either [a] or [o]. It
> recognized that in implementations which *do* make such a
> distinction, the shape for QAMATS QATAN with the meaning [o] is
> different from the normal shape for QAMATS, and it recognized that
> the shape for QAMATS QATAN with the meaning [o] is larger, or
> otherwise differently shaped. It recognized that the special shape of
> the QAMATS QATAN is a *marked* shape, and it is for this reason that
> a new character was proposed for this particular usage.

The Qamats proposal and the UTC did not address interchange between users
who make the distinction and those who do not. Qamats Qatan should have at
least a compatibility decomposition to Qamats.

> The fact that the distinction between QAMATS and QAMATS QATAN was
> made in the twentieth century, and that the distinction between HOLAM
> and HOLAM HASER FOR VAV was made in the eleventh century, is
> irrelevant to the logic which suggests that both characters indicate
> a marked form different from a default unmarked form. Both solutions
> use the same logic for differentiating between a marked and an
> unmarked representation for Hebrew text.

But there is no distinction between Holam Haser for Vav and, for example,
Holam Haser for Yod. Yet the proposal would make them different.

This proposal should not be accepted.

The UTC should either accept the alternative proposal, using ZWNJ, as a
reasonable compromise which has been extensively and intensively discussed
and accepted by most Hebrew users, or accept neither and request a better
compromise. Since the issue is 1100 years old, a few more months will not be
so significant, relatively speaking.


> --
> Michael Everson * * Everson Typography * * http://www.evertype.com