Re: Small Latin Letter m with Macron

From: Doug Ewell (dewell@adelphia.net)
Date: Thu Jan 16 2003 - 11:40:21 EST

    I've got a lot less to write since everybody else got there first.

    Christoph Päper <christoph dot paeper at tu dash clausthal dot de>
    wrote:

    > I recently learned in <news:de.etc.sprache.deutsch> that there has
    > been a tradition (in handwritten text more than in print) of writing
    > "mm" as only one "m" with a macron above. I can't find any such
    > character in Unicode, just U+1E3F and U+1E41.

    Assuming that you want to encode the m-macron directly--rather than
    encoding "mm" and letting a German-handwriting-specific rendering system
    convert this to m-macron, as Ken suggested--the correct solution would
    be to use a combining sequence, "m" followed by U+0304 COMBINING MACRON.
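
    A minimal sketch of that combining sequence, assuming Python 3 and its
    standard unicodedata module (the variable names are only illustrative).
    It builds "m" followed by U+0304 and checks that NFC normalization
    leaves the sequence alone, since Unicode has no precomposed m-macron to
    compose it into:

```python
import unicodedata

# "m" followed by U+0304 COMBINING MACRON: two code points, one visible letter.
m_macron = "m\u0304"

code_points = [f"U+{ord(ch):04X}" for ch in m_macron]
mark_name = unicodedata.name(m_macron[1])

# NFC composes a sequence only when a precomposed character exists.
# There is none for m with macron, so the string comes back unchanged.
nfc = unicodedata.normalize("NFC", m_macron)

print(code_points)      # ['U+006D', 'U+0304']
print(mark_name)        # COMBINING MACRON
print(nfc == m_macron)  # True
```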

    I suppose you could use U+0305 COMBINING OVERLINE instead, but the
    decision of which mark to use should be based on whether the mark really
    is a macron or an overline, not on the width of the glyph. U+0304
    already has to adjust its width depending on whether it appears over an
    "i" or an "a".

    > You could of course build something similar with "m"+U+0305 to
    > resemble the look, but that won't become "mm" (just "m" or "m¯") after
    > a conversion to e.g. ISO-8859-1.

    Two important points here. First, a combining sequence doesn't simply
    "resemble the look" of a precomposed character; it is *canonically
    equivalent* to the precomposed character. If you wanted to represent an
    "a" with macron, which does exist in a precomposed form, you would be
    just as correct using either U+0101 or the sequence U+0061 U+0304
    (though normalization might require you to choose one or the other;
    see Unicode Standard Annex #15).
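
    The a-macron case can be checked directly. This is a small sketch,
    again assuming Python 3 and its standard unicodedata module; the
    variable names are only illustrative. The two spellings differ when
    compared code point by code point, and normalizing to NFC or NFD
    converts one into the other:

```python
import unicodedata

precomposed = "\u0101"  # LATIN SMALL LETTER A WITH MACRON
sequence = "a\u0304"    # U+0061 followed by U+0304 COMBINING MACRON

# A raw comparison sees different code points.
raw_equal = precomposed == sequence

# NFC composes the sequence; NFD decomposes the precomposed character.
composed = unicodedata.normalize("NFC", sequence)
decomposed = unicodedata.normalize("NFD", precomposed)

print(raw_equal)                # False
print(composed == precomposed)  # True
print(decomposed == sequence)   # True
```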

    Second, no Unicode character that is not already in (e.g.) ISO 8859-1 is
    ever "automatically" converted to an 8859-1 character. You will always
    need some explicit mapping table or logic to perform such a
    conversion. This is just as true for a precomposed character as it is
    for a combining sequence. If you wanted to build a conversion layer to
    convert between "m̄" and "mm", you could certainly do so.
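
    One possible shape for such a conversion layer, sketched in Python 3.
    The FALLBACKS table and the to_latin1 function are hypothetical names
    invented for this illustration, and the table holds only the single
    mapping discussed in this thread. Text is normalized to NFC first so
    that characters ISO 8859-1 does contain (such as "ä") arrive in
    precomposed form, and the explicit table is applied before encoding:

```python
import unicodedata

# Hypothetical mapping table: only the m-macron example from this thread.
FALLBACKS = {"m\u0304": "mm"}


def to_latin1(text: str) -> bytes:
    """Encode text as ISO 8859-1 after applying the explicit fallbacks."""
    # NFC keeps "m" + U+0304 as a sequence but composes e.g. "a" + U+0308.
    text = unicodedata.normalize("NFC", text)
    for sequence, replacement in FALLBACKS.items():
        text = text.replace(sequence, replacement)
    # Strict encoding: anything still outside ISO 8859-1 raises an error.
    return text.encode("iso-8859-1")


# Without the table nothing is converted "automatically"; encoding fails.
try:
    "kom\u0304en".encode("iso-8859-1")
    failed_without_table = False
except UnicodeEncodeError:
    failed_without_table = True

print(failed_without_table)        # True
print(to_latin1("kom\u0304en"))    # b'kommen'
print(to_latin1("Pa\u0308per"))    # b'P\xe4per'
```

    The strict encode at the end is deliberate: a sequence with no entry in
    the table raises an error rather than being dropped silently.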

    -Doug Ewell
     Fullerton, California


