Re: Hyphen

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Sat Jan 17 2009 - 12:22:07 CST


    On 1/17/2009 2:54 AM, Jukka K. Korpela wrote:
    >
    > I don’t think any software should implement UAX #14 as such, except
    > programs specifically designed to test the effects of UAX #14. It is
    > absurd, for example, to break the expression “-1” after the
    > HYPHEN-MINUS character.
    Nobody argues that point. UAX#14 is conceived as a baseline that
    implementations customize to get better results.

    That doesn't explain why Microsoft chose to disregard the very clear
    semantics of HYPHEN. That looks very much like a bug. Unfortunately, it
    means that text in which HYPHEN-MINUS has been replaced by the more
    appropriate characters MINUS and HYPHEN works less well than text that
    retains the undifferentiated HYPHEN-MINUS. That's definitely contrary to
    the expectations of those who asked for the encoding of HYPHEN early in
    the history of Unicode.

    HYPHEN and MINUS (and EN DASH) were introduced to allow authors to
    unambiguously encode whether "-123" is a negative number, or something
    that can be wrapped to a new line before the 123.
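
    To make the distinction concrete, here is a minimal Python sketch that
    lists the characters in question. The Line_Break classes are hand-copied
    from the UCD's LineBreak.txt as an assumption of this sketch (Python's
    unicodedata module does not expose that property), and the helper name
    describe is purely illustrative.

```python
import unicodedata

# Line_Break classes hand-copied from LineBreak.txt (assumption of this
# sketch; they are not looked up programmatically).
LINE_BREAK_CLASS = {
    "\u002D": "HY",  # Hyphen: the undifferentiated HYPHEN-MINUS
    "\u2010": "BA",  # Break After: HYPHEN
    "\u2013": "BA",  # Break After: EN DASH
    "\u2212": "PR",  # Prefix Numeric: MINUS SIGN
}


def describe(ch):
    """Return (code point, character name, Line_Break class) for ch."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), LINE_BREAK_CLASS[ch])


if __name__ == "__main__":
    for ch in LINE_BREAK_CLASS:
        print(*describe(ch))
```

    An author who writes U+2212 before "123" has said "negative number";
    one who writes U+2010 has said "a break after this is acceptable".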

    It is because of the random and occasionally spotty nature of some of
    the early support of Unicode that specifications such as UAX#14 are even
    necessary. If everyone got it right, there would be no need for it.

    A./

    PS: "-1" is very short. A smart line-layout algorithm would not accept
    such a line break, even if it is a formal line-break opportunity. That's
    true whether the example is "-1" or "-a". Wrapping a single character to
    a new line after a hyphen does not look good, but Word (2003) does it
    anyway (after HYPHEN-MINUS).

    According to UAX#14, a line-layout algorithm gets to decide which line
    break opportunities to use: UAX#14 is designed to provide the
    candidates, but not the algorithm for selecting the actual line
    breaks. That distinction is often blurred in discussions of UAX#14.
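
    The split between candidates and selection can be sketched roughly as
    follows. This is not anything specified by UAX#14: the candidate offsets
    are a hand-written list standing in for whatever a line-break
    implementation would report, and select_breaks, tail_length and min_tail
    are hypothetical names for a simple greedy layout policy that refuses to
    strand a very short fragment after a break inside a word.

```python
def tail_length(text, offset):
    """Count non-space characters from offset up to the next space or the end."""
    end = offset
    while end < len(text) and not text[end].isspace():
        end += 1
    return end - offset


def select_breaks(text, candidates, width, min_tail=2):
    """Greedily pick breaks from candidates (offsets with 0 < offset < len(text)).

    A candidate that follows a space is always usable. A candidate inside a
    word (for example after a hyphen) is usable only if at least min_tail
    characters of that word would move to the next line.
    """
    usable = [
        b for b in sorted(set(candidates))
        if text[b - 1].isspace() or tail_length(text, b) >= min_tail
    ]
    lines, start = [], 0
    while len(text) - start > width:
        fitting = [b for b in usable if start < b <= start + width]
        if not fitting:
            break  # no acceptable candidate: let this line overflow
        lines.append(text[start:fitting[-1]])
        start = fitting[-1]
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    text = "value is -1 here"
    candidates = [6, 9, 10, 12]  # after each space, and after the hyphen
    print(select_breaks(text, candidates, width=10))
    print(select_breaks(text, candidates, width=10, min_tail=1))
```

    With the default min_tail the break after the hyphen is skipped and
    "-1" stays together; with min_tail=1 the same candidates yield the
    break that Word (2003) is described as taking. Lines keep their
    trailing spaces so that joining them reproduces the input.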
