From: Peter Kirk (firstname.lastname@example.org)
Date: Sun Jul 18 2004 - 10:25:26 CDT
On 18/07/2004 12:51, Michael Everson wrote:
> At 13:00 +0300 2004-07-18, Jony Rosenne wrote:
>>> Jony is arguing to extend AccentFolding to Hebrew (fold to
>>> unpointed). His suggestion is to fold *all* combining marks used
>>> with Hebrew in that case. I want to double check that he really
>>> means all combining marks in the Hebrew block, or just some of
>>> them.
>> I did mean all. All points and cantillation marks in Hebrew are
> In the Hebrew language, perhaps. But in other languages, like Yiddish,
> which use the Hebrew script, at least some points are NOT optional,
> and "dropping" them causes textual corruption and loss of data.
The same is of course true of accent removal in Latin script, in many
European languages. A general accent folding, like DUCET, has to strike
the best compromise among the preferred usages of the most widely used
languages; alternatively, it can be tailored to the needs of specific
languages.
Indeed in some sense every folding involves loss of data; that is the
nature of a folding. That doesn't stop generic accent removal being a
useful folding, in Latin and Hebrew scripts.
The question in one sense is whether accent and diacritic folding is a
graphical process or a logical one. If it is a logical process, it has
to take into account all sorts of potentially language-specific
variables such as the phonetic function of each combining mark. But it
makes more sense, within the scope of Unicode folding, for it to be
specified as a graphical process, the removal of auxiliary glyphs and
glyph modifiers from base characters without regard for their phonetic
effect or their status within the orthography of particular languages.
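To make the graphical reading concrete, here is a minimal Python sketch. It
is not taken from any Unicode folding specification. It assumes that
"auxiliary glyphs and glyph modifiers" can be approximated by characters of
general category Mn (nonspacing mark) after canonical decomposition. The
names fold_marks and keep are hypothetical; keep only illustrates how a
language-specific tailoring might exempt certain marks.

```python
import unicodedata


def fold_marks(text, keep=frozenset()):
    """Strip nonspacing marks from text, ignoring their phonetic role.

    Assumption: "mark" means general category Mn after NFD decomposition.
    `keep` is a hypothetical tailoring hook; marks listed there survive.
    """
    # Decompose first so precomposed letters such as U+00E9 expose
    # their combining marks as separate code points.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch
        for ch in decomposed
        if unicodedata.category(ch) != "Mn" or ch in keep
    )
```

Under these assumptions, fold_marks("caf\u00e9") gives "cafe", and a pointed
Hebrew word loses all its points and cantillation marks. A Yiddish tailoring
could pass keep={"\u05b7", "\u05b8"} so that pasekh and komets survive. The
result is left in decomposed (NFD) form.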
Anyway, is Yiddish in fact never written completely unpointed? That
would surprise me.
-- Peter Kirk email@example.com (personal) firstname.lastname@example.org (work) http://www.qaya.org/