Re: double hyphen

From: Jukka K. Korpela (
Date: Mon Mar 07 2005 - 08:36:25 CST

    On Mon, 7 Mar 2005, Michael Everson wrote:

    > a "decree" that hyphenated family-names should no longer be
    > spelled with hyphens but now with an en-dash or em-dash does not have
    > to do with encoding.

    That is correct, though it might affect decisions about the properties
    of characters. The line breaking rules, for example, reflect assumptions
    about how characters, and even combinations of characters, are commonly
    used; if actual usage changes considerably, some reconsideration may be
    needed.

    But what matters in encoding is how the characters have been specified.
    Unfortunately many standards and recommendations that prescribe the use of
    special characters do not identify them by Unicode numbers or names or in
    any other unique manner. When the normative version of a standard is a
    printed document, it can be impossible to decide which character is meant:
    all we have is a particular glyph instance. For example, what is the
    dot-like character used in multiplication of units in the SI? People
    commonly encode it as the middle dot, but it would more logically be the
    dot operator.
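
    To make the ambiguity concrete, here is a minimal sketch in Python that
    prints the code point, name and general category of the two candidates.
    The helper name describe() is mine, purely for illustration; only the
    standard unicodedata module is assumed.

```python
import unicodedata

# Two visually similar candidates for the SI multiplication dot.
CANDIDATES = ["\u00B7", "\u22C5"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (category)' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


for ch in CANDIDATES:
    # Po = other punctuation, Sm = math symbol
    print(describe(ch))
```

    The glyphs may look identical in print, but the first is classified as
    punctuation and the second as a mathematical symbol, which is why a
    printed norm alone cannot settle the choice.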

    Similarly, what have the French officials really decided? I had understood
    that the rules say that two consecutive hyphens should be used. (That would
    be somewhat vague too, since they might not have considered the differences
    between hyphen-minus, minus, and nonbreaking hyphen.) But is it really the
    en dash, or the em dash, or just some dash-like character (pair?) of
    unspecified length and identity?
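
    As a rough illustration of how many distinct encodings could lie behind
    one printed mark, the following Python sketch lists the dash-like
    characters found in a string. The set DASH_LIKE, the function
    dash_report() and the sample name are my own assumptions for the
    example, not anything taken from the French rules.

```python
import unicodedata

# Characters a printed "hyphen" or "dash" might plausibly stand for.
DASH_LIKE = {
    "\u002D",  # HYPHEN-MINUS
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2212",  # MINUS SIGN
    "\u2013",  # EN DASH
    "\u2014",  # EM DASH
}


def dash_report(text: str) -> list[str]:
    """List 'U+XXXX NAME' for each dash-like character in text, in order."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch)}"
        for ch in text
        if ch in DASH_LIKE
    ]


# Hypothetical double-barrelled name, encoded in two different ways.
print(dash_report("Dupont--Martin"))
print(dash_report("Dupont\u2013Martin"))
```

    The two spellings may look much alike on paper, yet the first contains
    two hyphen-minus characters and the second a single en dash.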

    Jukka "Yucca" Korpela,
