Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

From: Peter Kirk (peterkirk@qaya.org)
Date: Sat Apr 03 2004 - 07:03:42 EST


    On 02/04/2004 15:01, Asmus Freytag wrote:

    > ...
    >
    > Think of the example of SHY (soft hyphen), used to mark possible
    > hyphenation points in a word. A while ago we had a discussion on this
    > list where there was an interesting minimal pair of German compounds:
    >
    > Wachs|tu-be (tube of (or made of) wax)
    > Wach|stu-be (guard room)
    >
    > The word boundary (which is also a hyphenation point) is marked as |,
    > a secondary hyphenation point is marked with -. In other words, each
    > word has two SHYs in it, but not both in the same location.
    >
    > I can remove the SHYs from these words, and if the text is not broken
    > across lines at that point, its semantics for the human reader do not
    > change. With context, the text is unambiguous, but if there isn't
    > enough context, the text is clearly ambiguous.
    >
    > However, equally clearly, with the SHY left in, the text is (in its
    > internal representation) entirely unambiguous, even if that semantic
    > difference is not surfaced to the reader (except if a line break
    > fortuitously happens to be present in the first half of the word).
    >
    > Of course a (good) screen reader could pick up on the difference and
    > split the compound correctly when pronouncing it.
    >
    Interesting. But suppose the typesetting rules for German were changed
    so that hyphenation is no longer permitted, or so that (as in many
    languages) hyphenation points are determined strictly from the letters.
    These two words can no longer be distinguished by the position of SHY.
    But the good screen reader would still need to distinguish their
    pronunciations. Is there any type of character which could be defined,
    in Unicode, to preserve this distinction, but to be completely hidden in
    display? Perhaps some kind of zero width morpheme break character? I
    suppose ZWNJ or WJ could be used, but they might have other undesirable
    characteristics. (ZWNJ would inhibit formation of an st ligature in
    certain fonts in Wachs|tube, but maybe that is also desirable.)
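
To make the point concrete, here is a rough Python sketch of the two encodings under discussion. It assumes SHY (U+00AD) is placed at the hyphenation points shown above, and that ZWNJ (U+200C) stands in for the hypothetical hidden morpheme break; the helper names `strip_format_chars` and `split_at` are invented for illustration and are not part of any standard API. WJ (U+2060) could be substituted for ZWNJ in the same way.

```python
import unicodedata

SHY = "\u00AD"   # SOFT HYPHEN
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER (assumed stand-in for a morpheme break)
WJ = "\u2060"    # WORD JOINER (the other candidate mentioned)


def strip_format_chars(text: str) -> str:
    """Drop every character of general category Cf (SHY, ZWNJ and WJ are all Cf)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def split_at(text: str, marker: str) -> list[str]:
    """Split a word at each occurrence of the invisible marker."""
    return text.split(marker)


# Encoding 1: SHY at both hyphenation points, as in the quoted example.
wax_tube_shy = f"Wachs{SHY}tu{SHY}be"
guard_room_shy = f"Wach{SHY}stu{SHY}be"

# Encoding 2: a single hidden marker at the morpheme boundary only.
wax_tube_zwnj = f"Wachs{ZWNJ}tube"
guard_room_zwnj = f"Wach{ZWNJ}stube"

# The stored strings differ, but the visible letters are the same.
print(wax_tube_shy == guard_room_shy)                                          # False
print(strip_format_chars(wax_tube_shy) == strip_format_chars(guard_room_shy))  # True

# A screen reader could recover the compound structure from the marker.
print(split_at(wax_tube_zwnj, ZWNJ))    # ['Wachs', 'tube']
print(split_at(guard_room_zwnj, ZWNJ))  # ['Wach', 'stube']
```

Either encoding keeps the two words distinct in storage; the open question is only which character has the fewest side effects on line breaking and ligature formation.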

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

