From: Philippe Verdy (email@example.com)
Date: Mon Nov 29 2004 - 11:52:15 CST
From: "Otto Stolz" <Otto.Stolz@uni-konstanz.de>
> Note that there is no algorithm to reliably derive the position of the
> syllable break from the spelling of a Word. You could even concoct pairs
> of homographs that differ only in the position of the syllable break
> (and, consequently, in their respective meaning). So far, I have only
> found the somewhat silly example
> - "Brief"+SYH+"lasche" (letter flap) vs.
> - "Brie"+SYH+"flasche" (bottle to keep Brie cheese in),
> but I am sure I could find better examples if I would try in earnest.
French hyphenation does not work reliably from orthographic rules alone.
The rules work quite well, but there are many exceptions that require a
hyphenation dictionary. I think the same is true of almost all languages
written with alphabets, and even of some languages written with so-called
"syllabic" scripts, probably as a matter of style: some breaks between
separate spoken syllables must be avoided because they are not the best
ones with respect to meaning (notably in compound words).
German is a particular case because it has so many possible compound words.
Breaks preferably occur between radical words rather than between syllables:
- because of other stylistic constraints,
- because short particles are better not detached from their respective
radical (but where is the best break in "hereinzugehen", or simply in
"zugehen"?), and
- because not all verb particles are detachable, as some belong to the
radical (there are many examples with "be", which can be a particle or a
prefix that is part of the radical).
Even if you allow hyphenation only between lexical units, some exceptions
will remain that cannot be resolved without understanding the semantics.
Such compound words with no separator are extremely rare in English, and
very rare in French.
(French examples: there is a clear spoken syllable break in "millionce" after
"-li-" and before "-on-", which are pronounced with separate vowels, but in
"million" no break occurs within "-lions", which is a single syllable
pronounced with a diphthong; neither of these examples is a compound word.)
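As a rough illustration of why breaking only between lexical units still
leaves ambiguities, here is a minimal Python sketch. The tiny lexicon and the
function names are hypothetical, made up for this example; a real hyphenator
would use a full dictionary. It lists every way to split a word into known
lexical units, and Otto's example yields two splits that only the meaning can
tell apart.

```python
SOFT_HYPHEN = "\u00AD"  # U+00AD SOFT HYPHEN, written SYH in the quoted text

# Hypothetical toy lexicon, just large enough for the example.
LEXICON = {"brief", "brie", "lasche", "flasche"}


def segmentations(word, lexicon=LEXICON):
    """Return every way to split `word` into a sequence of lexicon entries."""
    word = word.lower()
    if not word:
        return [[]]  # exactly one way to split the empty string
    results = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            # Keep this head and try to split the remainder recursively.
            for tail in segmentations(word[i:], lexicon):
                results.append([head] + tail)
    return results


def with_soft_hyphens(parts):
    """Join lexical units with soft hyphens marking the allowed breaks."""
    return SOFT_HYPHEN.join(parts)


candidates = segmentations("Brieflasche")
# Two candidates come back: brie+flasche and brief+lasche.
# The spelling alone cannot decide between them.
```

Whenever more than one candidate is returned, the choice has to come from the
author (or from the context), not from the algorithm.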
But in German, hyphenation is still preferable to breaking only between
words (at spaces), because of the average length of compound words: without
it, margin alignment may look ugly and be hard to read in narrow columns, as
in newspapers or dictionaries. Dutch allows more freedom in creating
compounds, which can often be written with or without a separator (modern
Dutch style prefers using separators, or avoiding compounds altogether by
separating the words with spaces, but historically Dutch used the German
style, which is still in use today despite its possible semantic ambiguities).
I think that a German writer who sees a possible ambiguity will often
accept using an unconditional hyphen to create compound words (in your
example, he would write "Brief-Lasche" or "Brie-Flasche" but not
"Brieflasche", whose interpretation is problematic because there is no easy
way to determine it, even given the funny semantics of the two alternatives,
unless the author is sure that ligatures are handled correctly: a
ligature on "fl" for the interpretation as "Brie-Flasche", and no ligature,
with a narrow spacing between f and l, for the interpretation as
"Brief-Lasche").
(Historically, German texts were full of ligatures, much more so than texts
in other languages written in the Latin script, although those ligatures now
tend to disappear from most modern publications. Given the German rule that
a ligature should not occur between two syllables, and should be present
within the same radical, it is easy to see that ligatures are part of the
orthographic system and that they have a semantic value which helps the
correct understanding of text. So it would be even more important to use
ZWNJ or ZWJ in German words, rather than letting a renderer do this job
automatically but inaccurately. For simplicity, I think that a ZWNJ inserted
between radicals to prevent their ligature would be easier to manage than a
ZWJ between two ligaturable letters that must be kept in the same syllable).
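The ZWNJ approach suggested above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the helper name
join_radicals is hypothetical, and the author is assumed to supply the
radicals explicitly, since (as discussed) they cannot be derived reliably
from the spelling.

```python
ZWNJ = "\u200C"  # U+200C ZERO WIDTH NON-JOINER


def join_radicals(radicals):
    """Build a compound, putting a ZWNJ at every boundary between radicals.

    The ZWNJ asks the renderer not to form a ligature across the boundary;
    letters inside one radical stay adjacent and remain free to ligate.
    """
    return ZWNJ.join(radicals)


brief_lasche = join_radicals(["Brief", "lasche"])  # f and l are separated
brie_flasche = join_radicals(["Brie", "flasche"])  # "fl" stays adjacent

# Without the ZWNJ, both strings have the same visible spelling.
plain_a = brief_lasche.replace(ZWNJ, "")
plain_b = brie_flasche.replace(ZWNJ, "")
```

The two encoded strings differ even though their visible letters are the
same, so the author's intent survives in the text itself instead of depending
on the renderer's guess.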