Handling non-ligation points in text processors (was: CGJ for Two Greek Ligatures?)

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Mar 07 2005 - 12:26:27 CST

    Doug Ewell wrote:
    > German and other languages where ligation depends on syllable or subword
    > boundaries might apply dictionary lookup,

    For languages where ligation depends on subword boundaries (e.g., German),
    MS Word should treat the pertinent marker (“bedingter Trennstrich” in
    the German user interface; I don’t know the English equivalent; the hot
    key is “Ctrl” + “-”) as both a possible hyphenation point and a boundary
    across which no ligation may take place. This could easily be implemented
    by inserting a ZWNJ at every hyphenation point entered by the user
    whenever the text is sent to a rendering process or stored in a non-Word
    format.
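
    A minimal sketch of that export step, in Python. It assumes that the
    user-entered hyphenation point arrives in the exported text as U+00AD
    SOFT HYPHEN and that the ZWNJ (U+200C) is placed directly after it; both
    are assumptions made for illustration, as is the function name
    add_non_ligation_marks. It does not describe how Word actually works.

```python
SOFT_HYPHEN = "\u00ad"  # assumed plain-text form of the user's hyphenation mark
ZWNJ = "\u200c"         # ZERO WIDTH NON-JOINER


def add_non_ligation_marks(text: str) -> str:
    """Insert a ZWNJ after every soft hyphen that does not already have one."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        # Skip the insertion if a ZWNJ already follows, so the step is idempotent.
        if ch == SOFT_HYPHEN and text[i + 1:i + 2] != ZWNJ:
            out.append(ZWNJ)
    return "".join(out)


# "Auflage" = "Auf" + "lage": an fl ligature would span the subword boundary.
word = "Auf\u00adlage"
exported = add_non_ligation_marks(word)
```

    Because an existing ZWNJ is left alone, running the step twice on the
    same text should not pile up marks.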

    This would make proper ligation feasible without dictionary lookup.

    Of course, this also applies to other text-processing software; it is just
    that I know MS Word better.

    Rationale:

    In German, subword boundaries are the preferred hyphenation points:
    <http://www.ids-mannheim.de/reform/f.html#P111>. Authors, and typists,
    are supposed to prevent misleading hyphenations:
    <http://www.ids-mannheim.de/reform/f.html#111E2>. Hence, many typists
    have made a habit of marking the subword boundaries with “Ctrl” + “-”
    (at least, I have done so). Only in very narrow columns would one allow
    hyphenation at a non-subword (but syllable) boundary, as those additional
    hyphenation points tend to mislead the reader. Example, actually seen in
    “Das große Haus- und Familienbuch der Spiele” by Robert E Lembke: “Radiosen-
    dung” – never heard of “Radiosen”, they are not even in the encyclopaedia,
    what might their dung look like – oh, now I see, that guy is discussing
    a “Radio-Sendung” (radio broadcast).

    Wrong ligation is much less conspicuous than wrong hyphenation, and many
    fonts don’t do ligation anyway; hence, it would be difficult to train
    typists to insert an explicit ZWNJ to prevent wrong ligations.

    Furthermore, there is no reason to mark subword boundaries twice on data
    entry, once for hyphenation and once for non-ligation. On the contrary,
    there are orthographic and typographic reasons to mark them just once
    (for both hyphenation and non-ligation).

    Word knows which parts of a document are in German; the rendering process
    (or a plain text file) normally has no information about the language.
    Hence, Word should apply this information to enable proper ligatures, if
    the font has the pertinent glyphs.
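
    To illustrate using the language information before it is lost, here is
    a small Python sketch. The representation of a document as a list of
    (language, text) runs, the "de" tag, and the function name export_runs
    are all hypothetical; the soft hyphen (U+00AD) again stands in for the
    user-entered hyphenation point.

```python
from typing import List, Tuple

SOFT_HYPHEN = "\u00ad"  # assumed plain-text form of the user's hyphenation mark
ZWNJ = "\u200c"         # ZERO WIDTH NON-JOINER


def export_runs(runs: List[Tuple[str, str]]) -> str:
    """Flatten (language, text) runs to plain text.

    Only in runs tagged as German ("de") is a ZWNJ added after each soft
    hyphen; other runs are passed through unchanged.
    """
    parts = []
    for lang, text in runs:
        if lang == "de":
            text = text.replace(SOFT_HYPHEN, SOFT_HYPHEN + ZWNJ)
        parts.append(text)
    return "".join(parts)


runs = [("de", "Auf\u00adlage "), ("en", "co\u00adoperate")]
plain = export_runs(runs)
```

    The plain-text result no longer carries the language tags, but the
    German subword boundaries remain marked for the renderer.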

    Best wishes,
       Otto Stolz


