From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Nov 05 2007 - 09:41:27 CST
Jeroen Ruigrok van der Werven wrote:
> German is like Dutch, for all I know, in the ways you hyphenate words
> (typically on a syllable basis). And the way you hyphenate determines
> whether or not you can use ligatures or not. So I have to agree with
> Werner here, the syllable boundary stops the ligature formation.
Not really. The syllable boundary stops ONLY SOME ligature formations. It
does not prevent the "ff" ligature, for example, because the two letters
(almost) always straddle a syllable break (except at the end of a word).
The "ff" ligature is an example where ligation occurs for aesthetic
reasons.
In fact, the rule that determines whether a ligature is disallowed is based
on radicals, not on syllables: in a compound word, it does not matter if a
radical is multisyllabic, as ligatures are permitted everywhere within the
radical, but not across a radical boundary. The rule is, however, further
constrained by the type of ligature:
* the ess-tsett, for example, is a ligature that obeys additional
constraints, where syllable breaks are still significant, and not just the
radical break; for this reason, no ess-tsett is allowed in the verb
"müssen", even though there is only one radical (but two syllables), but it
is preferred in the conjugated form "muss" (but not in Swiss German, where
the ligature is now deprecated).
* in "Straffen", the "ff" ligature is permitted, because there's only one
radical, even if there are two syllables. The syllable break does not
prevent the ligature, because there's no compound word in German where a
single final "f" of a radical is followed by another "f" at the start of
the next radical (the reason is that there's normally no single "f" at the
end of a radical; it is always a double "f" in this case, and the ligature
is permitted there).
A double-f ligature may occur in the final position of a radical, but the
radical may still have derived forms with plural, genitive or conjugation
suffixes without losing the ligature. BUT the ligature can still be
hyphenated (provided the hyphenation respects other typographical rules,
such as not isolating a few letters on a line; this minimum number of
letters varies according to style, but is typically at least 3).
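
The radical-boundary rule described above could be sketched roughly as
follows. This is only an illustration, not a real implementation: the
function name, the list of candidate pairs, and the assumption that the
caller already supplies the word split into radicals are all hypothetical.
"Straffen" is treated as a single radical, as in the text; "Auflage"
(auf + Lage) is added as an example of a pair that straddles a boundary.

```python
# Hypothetical sketch: the caller supplies the word already split into radicals.
LIGATURE_PAIRS = ("ff", "fi", "fl")  # illustrative set, not exhaustive


def permitted_ligatures(radicals, pairs=LIGATURE_PAIRS):
    """Return (offset, pair) for each candidate pair lying inside one radical.

    Pairs that straddle a radical boundary are never reported, because each
    radical is scanned on its own.
    """
    found = []
    offset = 0  # index of the current radical's first letter in the whole word
    for radical in radicals:
        lowered = radical.lower()
        for i in range(len(lowered) - 1):
            if lowered[i:i + 2] in pairs:
                found.append((offset + i, lowered[i:i + 2]))
        offset += len(radical)
    return found


print(permitted_ligatures(["Straffen"]))     # "ff" lies inside the radical
print(permitted_ligatures(["Auf", "lage"]))  # "fl" straddles the boundary
```

Note that syllable breaks play no role in this sketch; only the radical
segmentation does, which is the point being made for "ff".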
In French the situation is similar: ligating the letters "ae" is almost
always permitted, but there are similar exceptions, and it is the radical
break that prevents the ligature, not the syllable break (the city of
"Caen", one syllable in modern French, is an exception for etymological
reasons: the "ae" ligature would have the phonetic value of "e", and this
would change the pronunciation of the leading "C" if the "a" were not kept).
The same thing is applicable to the ligature of "oe": the ligature is
permitted (and normally mandatory) each time the "o" is silent or does not
break the radical, but it is forbidden if there's a radical break (the
silent "o" was kept for etymological reasons, sometimes not justified, but
it is justified in a word like "cœur" to prevent pronouncing it like in
"minceur" with the vocal mutation of "c" into "s").
In other words, the set of rules for allowing or forbidding a ligature is
not only specific to each language, but also, within each language, specific
to each ligature. That's why I don't think any general-purpose renderer
can appropriately infer the presence or absence of ligatures: this requires
the help of language identification, and knowledge of the language-specific
radicals and of the allowed mutations and suffixes used for derived forms.
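
To make the "per language, per ligature" point concrete, here is a minimal
sketch of how such rules might be tabulated. The table layout, the
language codes and the function name are assumptions made for illustration
only; the entries merely paraphrase the examples given in this message and
are nowhere near complete.

```python
# Hypothetical rule table: which kinds of boundary block a given ligature.
# Keys are (language code, ligature); entries paraphrase this message only.
RULES = {
    ("de", "ff"): {"radical"},              # syllable breaks do not matter
    ("de", "ß"): {"radical", "syllable"},   # syllable breaks also matter
    ("fr", "æ"): {"radical"},
    ("fr", "œ"): {"radical"},
}


def may_ligate(language, ligature, boundaries_between):
    """Say whether the ligature may be formed across the given boundaries.

    `boundaries_between` is the set of boundary kinds found between the two
    letters (e.g. {"syllable"}). Unknown language/ligature pairs yield False:
    without knowledge of the rule, no ligature is suggested.
    """
    blocking = RULES.get((language, ligature))
    if blocking is None:
        return False
    return not (blocking & set(boundaries_between))


print(may_ligate("de", "ff", {"syllable"}))  # "Straffen" case
print(may_ligate("de", "ß", {"syllable"}))   # "müssen" case
```

Special cases such as "Caen" would need an exception list on top of this;
the table only captures the general boundary rule per ligature.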
This job belongs to a spellchecker, which must help the renderer by
inserting ligature hints (ZWJ) or by using a dedicated character (the "ae
letter" in Unicode is a ligature in some languages) everywhere ligatures are
permitted (and preferable or sometimes mandatory). A renderer alone should
NOT take the decision to create a ligature (and it should not be necessary
to use ZWNJ in texts to prevent a renderer from doing this: ZWNJ would be
used only to prevent a spell-checker from suggesting a ligature, and ZWNJ
should be completely ignored by renderers).
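
As a rough illustration of the hinting step proposed above, a spellchecker
that already knows the radicals of a word could emit ZWJ (U+200D) between
the letters of each permitted pair. This is a sketch under that assumption;
the function name and the list of pairs are hypothetical, and no existing
spellchecker is claimed to work this way.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER, used here as the ligature hint


def add_ligature_hints(radicals, pairs=("ff", "fi", "fl")):
    """Join the radicals into one word, inserting ZWJ inside permitted pairs.

    Each radical is processed separately, so a pair that straddles a radical
    boundary never receives a hint.
    """
    hinted = []
    for radical in radicals:
        out = []
        i = 0
        while i < len(radical):
            if radical[i:i + 2].lower() in pairs:
                out.append(radical[i] + ZWJ + radical[i + 1])
                i += 2  # skip both letters of the hinted pair
            else:
                out.append(radical[i])
                i += 1
        hinted.append("".join(out))
    return "".join(hinted)


print(repr(add_ligature_hints(["Straffen"])))     # ZWJ between the two "f"
print(repr(add_ligature_hints(["Auf", "lage"])))  # no hint across the boundary
```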