At 12:45 PM 7/29/98 -0700, Jonathan Rosenne wrote:
>>Mark H. David (email@example.com) wrote:
>>>Is the conversion from canonical Hebrew to compatibility
>>>handled? E.g., double-yod-plus-patah -> double-yod, patah
>>>(necessary for Yiddish).
>It must be converted to Yod Yod Patah to be viewable by browsers conforming
>to the Israeli standard.
This is an unacceptable, or at best sloppy, conversion. A yod followed by a
yod with a patah underneath it is not the same as a double-yod with a
patah underneath both yods. The problem is that the Israeli standard lacks
the double-yod character. The solution is to add this character; failing
that, the best one can do is a sloppy conversion.
>The Yiddish characters should be canonically or compatibility decomposed to
>be useable. The current status of the Unicode data base is not clear to me.
>If it does not include these decompositions it must be fixed.
The double-yod cannot be decomposed further in Unicode. If it were, it
would lead to mistakes like binding the nonspacing mark (patah, etc.) to
the wrong character. This would lead to the wrong presentation form, i.e.,
a yod followed by a yod with a patah underneath it.
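
The binding problem can be illustrated with a short Python sketch. It is
not part of the original exchange: it relies on the Unicode data bundled
with Python's standard unicodedata module, and the helper names
(is_atomic, naive_split, base_of_last_mark) are hypothetical, chosen only
for this example.

```python
import unicodedata

DOUBLE_YOD = "\u05F2"  # HEBREW LIGATURE YIDDISH DOUBLE YOD
YOD = "\u05D9"         # HEBREW LETTER YOD
PATAH = "\u05B7"       # HEBREW POINT PATAH (nonspacing mark)


def is_atomic(ch: str) -> bool:
    """True if ch has no canonical or compatibility decomposition mapping."""
    return unicodedata.decomposition(ch) == ""


def naive_split(text: str) -> str:
    """Hypothetical over-decomposition: rewrite double-yod as yod, yod."""
    return text.replace(DOUBLE_YOD, YOD + YOD)


def base_of_last_mark(text: str) -> str:
    """Return the base character that the trailing nonspacing mark attaches to."""
    for ch in reversed(text):
        if unicodedata.combining(ch) == 0:
            return ch
    raise ValueError("no base character found")


correct = DOUBLE_YOD + PATAH   # patah sits under the double-yod as a unit
broken = naive_split(correct)  # yod, yod, patah

print(is_atomic(DOUBLE_YOD))                     # double-yod has no decomposition
print(base_of_last_mark(correct) == DOUBLE_YOD)  # mark binds to the double-yod
print(base_of_last_mark(broken) == YOD)          # mark now binds to the second yod only
```

After the naive split, the patah belongs to the second yod alone, which
is the wrong presentation form described above.
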
In the example given, double-yod-patah should be decomposed to double-yod,
patah, but no further. This can then be reversed back to double-yod-patah,
which is the only representation for this presentation form in MacOS Hebrew.
In Windows Hebrew (Code Page 1255), the double-yod exists as a character so
if you transcoded from MacOS Hebrew (double-yod-patah) -> Unicode
Canonical (double-yod, patah) -> Windows Hebrew (double-yod, patah), there
would be no problem.
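
A minimal sketch of this chain in Python, under two assumptions that are
not in the original message: the precomposed Unicode character U+FB1F
stands in for the MacOS Hebrew double-yod-patah code (no MacOS Hebrew
codec is used here), and Python's "cp1255" codec stands in for Windows
Hebrew. The function names are hypothetical.

```python
import unicodedata

# Stand-in for the precomposed MacOS Hebrew character (assumption).
PRECOMPOSED = "\uFB1F"  # HEBREW LIGATURE YIDDISH YOD YOD PATAH


def to_canonical(text: str) -> str:
    """Canonically decompose: double-yod-patah becomes double-yod, patah."""
    return unicodedata.normalize("NFD", text)


def to_windows_hebrew(text: str) -> bytes:
    """Decompose first, then encode to Code Page 1255."""
    return to_canonical(text).encode("cp1255")


def from_windows_hebrew(data: bytes) -> str:
    """Decode Code Page 1255 bytes back to Unicode."""
    return data.decode("cp1255")


canonical = to_canonical(PRECOMPOSED)
encoded = to_windows_hebrew(PRECOMPOSED)
round_trip = from_windows_hebrew(encoded)

print([f"U+{ord(ch):04X}" for ch in canonical])  # code points after decomposition
print(len(encoded))                              # one byte per character in cp1255
print(round_trip == canonical)                   # nothing was lost on the way
```

The decomposition stops at double-yod, patah, so both characters map to
single Code Page 1255 bytes and decode back unchanged.
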
If transcoding down to SII's Hebrew character set, there would have to be
information loss. For this reason, I suggest that the SII expand its
repertoire to include all the Unicode Hebrew block characters (main Hebrew
block, not the compatibility area), and that this standard be avoided for
Hebrew-alphabet applications until that happens.
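
The information loss can be sketched as follows. This assumes, purely for
illustration, that Python's "iso8859_8" codec is a reasonable stand-in for
the SII repertoire discussed here; the helper names are hypothetical.

```python
def encodable(ch: str, codec: str) -> bool:
    """True if the single character ch can be represented in codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def missing_from(text: str, codec: str) -> list[str]:
    """List the characters of text that codec cannot represent."""
    return [ch for ch in text if not encodable(ch, codec)]


yiddish = "\u05F2\u05B7"   # double-yod, patah
fallback = "\u05D9\u05D9"  # yod, yod (the lossy substitute)

# Stand-in for the SII set (assumption): neither character is available.
print([f"U+{ord(ch):04X}" for ch in missing_from(yiddish, "iso8859_8")])

# The two-yod fallback does encode, but the ligature and the patah are gone.
print(missing_from(fallback, "iso8859_8"))

# Code Page 1255 carries both characters.
print(missing_from(yiddish, "cp1255"))
```
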