Re: Proposing a DOUBLE HYPHEN punctuation mark

From: Asmus Freytag
Date: Tue Jan 23 2007 - 05:05:16 CST

    On 1/23/2007 1:05 AM, Jukka K. Korpela wrote:
    > On Mon, 22 Jan 2007, Doug Ewell wrote:
    >> I always thought the convention of using a double hyphen to indicate
    >> line-splitting hyphenation at a point where lexeme-joining
    >> hyphenation would have occurred anyway was a simply brilliant idea,
    >> one I wish were in more widespread use.
    > It would indeed be useful to make such a distinction, at the character
    > level, at the glyph level, or both. In text processing, it would be
    > relevant to know whether a word (or other expression) actually
    > contains a hyphen or there's just a hyphen at the end of a line to
    > indicate continuation of the word on the next line.
    > I'm biting my tongue to avoid saying that the soft hyphen character
    > was, at least in some people's interpretations, meant to act as a
    > line-splitting hyphenation character but then turned into a
    > discretionary hyphen.
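
    To make the text-processing point in the quoted paragraphs concrete,
    here is a minimal sketch of how hard-wrapped lines could be rejoined
    if the double-hyphen convention were followed. The function name
    `unwrap` and the choice of U+2E40 as the marker are assumptions made
    for illustration only; any otherwise unused character would serve as
    the stand-in. The sketch also assumes that a plain hyphen at the end
    of a line always marks a line split, which real text does not
    guarantee.

```python
HYPHEN = "-"
# Assumption: a stand-in marker for the double hyphen discussed here.
DOUBLE_HYPHEN = "\u2e40"


def unwrap(lines, double_hyphen=DOUBLE_HYPHEN):
    """Join hard-wrapped lines, resolving end-of-line hyphens.

    - A line ending in the double hyphen carried a real (lexeme-joining)
      hyphen at the break, so a single hyphen is kept in the joined word.
    - A line ending in a plain hyphen was split only for line breaking,
      so the hyphen is dropped and the word halves are rejoined.
    - Any other line is joined to the next with a single space.
    """
    out = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip empty lines in this simple sketch
        if out.endswith(double_hyphen):
            out = out[: -len(double_hyphen)] + HYPHEN + line
        elif out.endswith(HYPHEN):
            out = out[:-1] + line
        elif out:
            out += " " + line
        else:
            out = line
    return out


wrapped = ["a well" + DOUBLE_HYPHEN, "known hyphen-", "ation rule"]
print(unwrap(wrapped))
```

    Without the distinct marker, the first break would be
    indistinguishable from the second, and "well-known" would be rejoined
    as "wellknown".
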
    If you add non-obvious features to a standard, you have to make sure
    that information on how to use them is widely disseminated. Early 8-bit
    standards had to be ordered as paper copies, which meant that what most
    people referenced were re-creations of just the character layout part.
    Early standards also tended to not contain a lot of information on how
    characters were to be used, or explicitly allowed competing usage
    conventions (for control codes).

    Given how few characters are in each 8-bit standard, the number of
    characters where there is an associated uncertainty of what they were
    meant to encode (and this includes some printable characters as well)
    has proven astonishingly large....
    > Anyway, Unicode is about characters that are used, rather than
    > characters that should be used. On the other hand, this is a chicken
    > and egg problem these days. When most texts are written using
    > computers and appear in digital form, thereby inevitably using encoded
    > characters, there is little room for introducing new characters.
    As long as we are clear that a double hyphen is not intended to be used
    when the font style requires a doubled shape for the standard hyphen
    (and we have now sorted that out, finally), there's no longer a reason
    to prevent people from making an explicit distinction. Whether that kind
    of distinction will ever become mainstream, I don't know, but for my
    part, I'm now convinced that there are enough special needs for one of
    these things that it's time to add it.

    I'm firmly opposed to the idea that the main purpose here is to encode a
    specific semantic. That would have to be done by the rules of the
    orthography in which this character is used. In other words,
    CONTINUATION HYPHEN would be inappropriate as a character name.

    The character code should simply serve two purposes:
     1) allow a distinction between the new character and standard hyphen
     2) request a double stroke glyph

    We should also establish (perhaps more clearly than we have in the
    past) that font designs that use a slanted form for the hyphen do not
    need to encode a separate character for a short slanted dash, but use
    the character code for hyphen; that font designs that use a slanted
    form for the hyphen should use a slanted form for the double hyphen;
    and finally, that some fonts, such as Fraktur, will use a slanted
    double stroke form for the standard hyphen.

    That is the appropriate set of glyph variations for standard
    (non-decorative) fonts. It will be obvious that with a Fraktur font you
    can support neither the double hyphen nor the oblique double hyphen
    *characters*, as they would not be rendered with any distinction. That's
    fine; supporting them is not a requirement.

    Similarly, font styles that use slanted forms for any hyphen (double and
    single) cannot be used for those notations that need the oblique double
    hyphen (that's also fine).
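
    The glyph variations described above can be tabulated in a small
    sketch that reports which characters collapse onto the same glyph in
    each font style. The style names, glyph labels, and the `collisions`
    helper are hypothetical, chosen only to mirror the prose; the glyph
    listed for the oblique double hyphen in a standard font is an
    assumption based on its name.

```python
from itertools import combinations

# Hypothetical model: font style -> character -> glyph shape.
GLYPHS = {
    "standard": {
        "hyphen": "single horizontal",
        "double hyphen": "double horizontal",
        "oblique double hyphen": "double slanted",
    },
    "slanted": {
        "hyphen": "single slanted",
        "double hyphen": "double slanted",
        "oblique double hyphen": "double slanted",
    },
    "fraktur": {
        "hyphen": "double slanted",
        "double hyphen": "double slanted",
        "oblique double hyphen": "double slanted",
    },
}


def collisions(style):
    """Return the pairs of characters that share a glyph in this style."""
    glyphs = GLYPHS[style]
    return [
        (a, b)
        for a, b in combinations(glyphs, 2)
        if glyphs[a] == glyphs[b]
    ]


for style in GLYPHS:
    print(style, collisions(style))
```

    In this model a standard font keeps all three characters apart, a
    slanted-hyphen font cannot separate the double hyphen from the oblique
    double hyphen, and a Fraktur font cannot separate any of them.
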

    However, I would expect a font that uses a standard hyphen glyph to
    support a double hyphen with a standard double hyphen glyph, so that it
    can be used most widely. (Apart from the fact that it will take a while
    before a newly proposed character can be encoded and then much later

    This scheme is not so different from support for specialized
    distinctions needed for IPA or mathematical use. Fonts intended for such
    uses must accept a restriction of the glyph variation for certain common
    characters, in order to retain a visual distinction with another, more
    specialized character. Fonts for ordinary users are not so constrained
    and can be more fanciful or varied in their glyph choices.

