From: Peter Kirk (email@example.com)
Date: Tue Jul 29 2003 - 10:22:35 EDT
On 28/07/2003 23:37, Jony Rosenne wrote:
>We had a discussion in the SII and the consensus was that we should object to:
>- any change or addition related to Hebrew that would invalidate existing
>Unicode data or require its modification or re-examination
I can agree that any change should not invalidate existing valid data.
But that shouldn't imply that we must validate existing invalid data.
There is a lot of existing data which, although encoded in Unicode
characters, is invalid or mis-spelled in one sense or another,
deliberately so in order to kludge a reasonably good visual
representation from bad old software. For example, at
www.mechon-mamre.org ZWJ is inserted after vav and before holam when the
vav is a consonant because with certain software and font combinations
that has the required effect of shifting the holam to the left. We can't
simply declare in some kind of amnesty that every existing text is valid.
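To make the kludge concrete, here is a minimal Python sketch that locates the sequence described above (vav, then ZWJ, then holam) in a string. The function name and the sample words are illustrative assumptions, not taken from any existing tool, and the pattern only covers the exact three-character sequence with no other marks in between.

```python
import re

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
ZWJ = "\u200D"    # ZERO WIDTH JOINER

# Exactly the sequence described above: vav, ZWJ, holam.
ZWJ_KLUDGE = re.compile(VAV + ZWJ + HOLAM)


def find_zwj_kludges(text):
    """Return the index of each vav that is followed by ZWJ + holam."""
    return [m.start() for m in ZWJ_KLUDGE.finditer(text)]


# Illustrative samples: ayin + qamats + vav + ZWJ + holam + final nun,
# and a word with vav + holam but no ZWJ.
with_kludge = "\u05E2\u05B8\u05D5\u200D\u05B9\u05DF"
without_kludge = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"
print(find_zwj_kludges(with_kludge))     # [2]
print(find_zwj_kludges(without_kludge))  # []
```

A report like this only flags the data. Whether to rewrite it is a separate decision, since the ZWJ is the only thing marking these vavs as consonants.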
>- any change or addition to Unicode that would make the use of Hebrew more
>complicated or confuse the common user
Absolutely. But nothing confuses the common user more than not knowing
how he or she is supposed to encode a particular text. What is needed is
not so much changes to Unicode as clear guidelines for the common user.
>- any change or addition to Unicode that would require a user of Hebrew to
>have a higher level of knowledge, e.g. to distinguish between items not
>commonly distinguished, for example the two meanings of Vav with Holam.
Are we confining "user of Hebrew" to people who know how to speak the
language? If so these people already know how to distinguish the two
meanings of vav with holam because they pronounce them quite
differently. Some users of biblical Hebrew may not know the
pronunciation, but I don't think these are the people you have in mind.
On the other hand, if you are determined that these two graphically and
semantically distinct entities should be encoded identically, then at
least those of us who want or need to make a graphical or semantic
distinction are not entirely stuffed, i.e. left without a way ahead. For
it does seem to be possible to determine algorithmically, though not
entirely without ambiguity in some theoretical cases, which vav with
holam is which - the only ambiguity would be in cases where the word
before the vav with holam consists only of a string of vavs with dagesh
of which the first may be a vowel (shuruq) or a consonant.
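One way such an algorithm might look is sketched below in Python. It is only a heuristic under stated assumptions, not a worked-out proposal: it assumes fully pointed text, assumes that a vav with holam is a vowel (holam male) exactly when the preceding consonant carries no vowel point or sheva of its own, and ignores silent letters and other complications. The function names and labels are hypothetical.

```python
VAV = "\u05D5"
HOLAM = "\u05B9"
DAGESH = "\u05BC"
# Sheva and vowel points U+05B0..U+05BB, plus qamats qatan U+05C7.
VOWEL_POINTS = {chr(c) for c in range(0x05B0, 0x05BC)} | {"\u05C7"}


def split_clusters(word):
    """Group each Hebrew letter with the marks that follow it."""
    clusters = []
    for ch in word:
        if "\u05D0" <= ch <= "\u05EA" or not clusters:
            clusters.append(ch)
        else:
            clusters[-1] += ch
    return clusters


def classify_vav_holam(word):
    """Label each vav with holam as 'vowel', 'consonant' or 'ambiguous'.

    Returns a list of (cluster_index, label) pairs.
    """
    results = []
    # True: the previous consonant still lacks a vowel.
    # False: it already has one (or there is no previous consonant).
    # None: unknown, because the word so far is only vavs with dagesh.
    needs_vowel = False
    for i, cluster in enumerate(split_clusters(word)):
        base, marks = cluster[0], cluster[1:]
        pointed = any(m in VOWEL_POINTS for m in marks)
        if base == VAV and HOLAM in marks:
            if needs_vowel is None:
                label = "ambiguous"
            elif needs_vowel:
                label = "vowel"      # holam male for the previous consonant
            else:
                label = "consonant"  # vav carrying its own holam
            results.append((i, label))
            needs_vowel = False
        elif base == VAV and DAGESH in marks and not pointed:
            if needs_vowel is None:
                pass                 # still undecidable
            elif needs_vowel:
                needs_vowel = False  # read as shuruq
            elif i == 0:
                needs_vowel = None   # word-initial: shuruq or consonant
            else:
                needs_vowel = True   # consonantal vav with dagesh
        else:
            needs_vowel = not pointed
    return results


# lamed has no point, so the following vav with holam is a vowel
print(classify_vav_holam("\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"))
# ayin already has qamats, so the vav is a consonant
print(classify_vav_holam("\u05E2\u05B8\u05D5\u05B9\u05DF"))
```

The `None` state corresponds to the theoretical ambiguity mentioned above: as long as everything before the vav with holam is a string of vavs with dagesh, the sketch cannot tell whether the first of them is shuruq or a consonant.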
>- the suggestion to encode Biblical Hebrew separately is unacceptable.
I am glad to hear this clearly stated. I agree.
>The requirements of professional and knowledgeable users, such as Biblical
>scholars, should not be allowed to impose upon everyday users who are not
>blessed with such a profound knowledge and understanding.
Indeed. But also support for the special requirements of scholars should
not be restricted just because it goes beyond the requirements of everyday users.
>Consequently, it was suggested that the several issues with Biblical Hebrew
>recently mentioned, and several more which were not, should be solved by
>means of markup, outside the scope of Unicode. This is how they have been
>addressed in many of the references given. This is our recommendation.
What references are you referring to? Haralambous? I accept that markup
may be suitable for the rare cases of enlarged, reduced, raised and
broken letters which he mentions, as these are semantically the base
letter plus some essentially extra-textual information. But markup is
not appropriate for distinguishing between commonly occurring letters
which are distinct semantically and phonetically, as well as very often
graphically, like the different forms of vav with holam. Or is markup
being suggested as a solution of the Yerushala(y)im issue? If so I fail
to see how it addresses the problem, as markup does not inhibit
>Failing that, it was suggested that an existing Unicode character, such as
>ZERO WIDTH NO-BREAK SPACE, be used for "invisible" Hebrew letters, in cases
>such as Yerushala(y)im.
As there are many objections to ZWNBSP, would CGJ be an acceptable
alternative? But I do see why you might prefer to use a zero width base
character here rather than a combining character, although that would
not be appropriate for mittaxat in Exodus 20:4 and for right meteg.
>The third, and least favored, option is to add a special Unicode character
>to represent missing base characters such as the Yod in Yerushala(y)im.
-- Peter Kirk firstname.lastname@example.org http://web.onetel.net.uk/~peterkirk/