From: Mark E. Shoulson (mark@kli.org)
Date: Sun Jul 18 2004 - 21:25:42 CDT
Michael Everson wrote:
> At 13:00 +0300 2004-07-18, Jony Rosenne wrote:
>
>>> Jony is arguing to extend AccentFolding to Hebrew (fold to
>>> unpointed). His suggestion is to fold *all* combining marks used
>>> with Hebrew in that case. I want to double check that he really
>>> means all combining marks in the Hebrew block, or just some of
>>> them.
>>
>> I did mean all. All points and cantillation marks in Hebrew are
>> optional.
>
>
> In the Hebrew language, perhaps. But in other languages, like Yiddish,
> which use the Hebrew script, at least some points are NOT optional,
> and "dropping" them causes textual corruption and loss of data.

Mm, true. Though for all that, a lot of the Yiddish I've seen is also
written without vowel points. So the patah-alef and qamats-alef vowels,
and the yod-yod-patah vs. yod-yod diphthongs, must be distinguished from
context, like everything else.

Even so, there's probably some language out there that requires some
diacritics left in place on Hebrew letters (I don't know much about
other languages written in Hebrew letters; Elaine Keown knows that
better). But this folding is *supposed* to lose data. Even in Hebrew,
folding away all the vowels leaves something probably readable, but with
less actual information (e.g. foreign names or obscure words might not
be recoverable with 100% accuracy). And folding away diacritics of
Latin letters *certainly* causes data loss and textual corruption in
some languages. I was under the impression that losslessness was a
non-goal of this folding operation, and in fact Hebrew (and even
Yiddish) survives its scourge considerably better than a lot of
Latin-written languages.
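
To make the loss concrete, here is a minimal Python sketch of such a fold. It assumes "all combining marks used with Hebrew" means every nonspacing mark in the Hebrew block (U+0591 through U+05C7); the name fold_hebrew_marks is made up for illustration and does not come from any folding specification.

```python
import unicodedata

# Assumption: the marks to fold are the nonspacing marks (category Mn)
# in the Hebrew block, i.e. the points and cantillation marks.
HEBREW_MARKS_START = 0x0591
HEBREW_MARKS_END = 0x05C7


def fold_hebrew_marks(text):
    """Hypothetical helper: drop every Hebrew point and cantillation mark.

    The text is decomposed (NFD) first, so precomposed presentation forms
    such as U+FB2E (alef with patah) are folded as well. The result is
    returned in decomposed form.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch
        for ch in decomposed
        if not (
            HEBREW_MARKS_START <= ord(ch) <= HEBREW_MARKS_END
            and unicodedata.category(ch) == "Mn"
        )
    )


# The two Yiddish vowels that differ only in their point.
patah_alef = "\u05D0\u05B7"   # alef + patah
qamats_alef = "\u05D0\u05B8"  # alef + qamats

# After folding, both are a bare alef, so the distinction is gone.
print(fold_hebrew_marks(patah_alef) == fold_hebrew_marks(qamats_alef))
```

The category check matters because the same code point range also holds punctuation such as maqaf (U+05BE) and sof pasuq (U+05C3), which this sketch leaves alone. Whether a real fold should keep them is a separate question.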
~mark