Re: UTR# 15 - Objection to Hebrew Exclusions

From: Jonathan Rosenne (
Date: Fri Oct 22 1999 - 12:06:36 EDT

At 06:34 22/10/99 -0700, John Cowan wrote:
>Jonathan Rosenne scripsit:
>> The characters listed:
>> - are not part of the Hebrew subset
>> - are not required to support Hebrew
>> - are not available in Hebrew fonts
>> - are not supported or even recognized by most Hebrew software
>> - are not included in any Israeli national standard
>But they are part of CP1255, a widely supported industry standard.

Only some are, and many CP1255 fonts do not implement them, especially the
Israeli fonts.

>> I would like to observe that most users of Yiddish, in Israel and in
>> Europe, are satisfied with the standard Hebrew support. So we are not
>> excluding the language, only a particular usage.
>Normalization is something new.

I was referring to the Yiddish characters in Unicode and to Yiddish
publications in print, printed in Israel or in Warsaw.

>By making these characters compatibility rather than canonical compositions,
>much of the problem would go away. Hebrew users would continue to avoid
>them: Yiddish users would have them preserved in Normalization Forms
>C and D. This change might have other bad effects, however.

In form D there is no problem. Form C is the problem: It should not
normalize Hebrew texts out of the Israeli subset.

Let us look, for example, at FB1D # HEBREW LETTER YOD WITH HIRIQ. A Hebrew
text with vowels will contain several occurrences of Hiriq, some following
Yod and others following other letters. For us there is no difference: the
Hiriq should be treated the same way in every case. But if FB1D were not
excluded, then under form C the sequence Yod Hiriq would be changed
everywhere to FB1D, which is not recognized and will display as a blank
square or a question mark.
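
The effect on Yod Hiriq can be checked directly. The following is a minimal
sketch, assuming Python's standard unicodedata module, whose character data
comes from a later Unicode version than the draft under discussion; the
names survey, code_points and typed are illustrative only. It reports what
forms C and D do to the typed sequence and to FB1D itself.

```python
import unicodedata

YOD = "\u05D9"             # HEBREW LETTER YOD
HIRIQ = "\u05B4"           # HEBREW POINT HIRIQ
YOD_WITH_HIRIQ = "\uFB1D"  # HEBREW LETTER YOD WITH HIRIQ


def code_points(text):
    """Render a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def survey(text):
    """Return the text's code points under Normalization Forms C and D."""
    return {
        form: code_points(unicodedata.normalize(form, text))
        for form in ("NFC", "NFD")
    }


# Yod followed by Hiriq, as it would be typed on a Hebrew keyboard.
typed = YOD + HIRIQ

print("typed      ", survey(typed))
print("precomposed", survey(YOD_WITH_HIRIQ))
print("canonical decomposition of FB1D:",
      unicodedata.decomposition(YOD_WITH_HIRIQ))
```

With data in which FB1D is on the composition exclusion list, form C should
leave the typed sequence as 05D9 05B4. If form C ever reports FB1D for the
typed sequence, the character is being composed, which is the outcome
objected to here.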

Since it was recommended that Unicode texts should be pre-normalized at the
source, the user would have no control over this. Hebrew text that passed
through a conforming server would become unusable.

There is a related problem with characters such as 05F0 HEBREW LIGATURE
YIDDISH DOUBLE VAV, which is not defined as decomposable to Vav Vav. Most
Yiddish users will type two Vavs, so texts that look very similar will
compare as different.
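
The comparison problem can be sketched in the same way, again assuming
Python's unicodedata module. The helper fold_yiddish_ligatures is
hypothetical: it is an application-level folding invented for this
illustration, not a mapping defined by Unicode or by any normalization
form.

```python
import unicodedata

DOUBLE_VAV = "\u05F0"      # HEBREW LIGATURE YIDDISH DOUBLE VAV
TWO_VAVS = "\u05D5\u05D5"  # HEBREW LETTER VAV, typed twice

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def equal_under(form, a, b):
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Hypothetical folding table (an assumption of this sketch): each Yiddish
# ligature is replaced by the letter sequence it visually resembles.
_LIGATURE_FOLD = str.maketrans({
    "\u05F0": "\u05D5\u05D5",  # DOUBLE VAV -> VAV VAV
    "\u05F1": "\u05D5\u05D9",  # VAV YOD    -> VAV YOD
    "\u05F2": "\u05D9\u05D9",  # DOUBLE YOD -> YOD YOD
})


def fold_yiddish_ligatures(text):
    """Fold the three Yiddish ligatures into plain letter sequences."""
    return text.translate(_LIGATURE_FOLD)


for form in FORMS:
    print(form, "treats 05F0 and Vav Vav as equal:",
          equal_under(form, DOUBLE_VAV, TWO_VAVS))

print("equal after folding:",
      fold_yiddish_ligatures(DOUBLE_VAV) == fold_yiddish_ligatures(TWO_VAVS))
```

Since 05F0 carries no decomposition mapping, none of the four forms makes
the two spellings compare equal; only an explicit folding step such as the
one assumed above does, and whether that folding is acceptable is a
decision for the application.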

>John Cowan
