From: Doug Ewell (dewell@adelphia.net)
Date: Thu Jan 16 2003 - 11:40:21 EST
I've got a lot less to write since everybody else got there first.
Christoph Päper <christoph dot paeper at tu dash clausthal dot de>
wrote:
> I recently learned in <news:de.etc.sprache.deutsch> that there has
> been a tradition (in handwritten text more than in print) of writing
> "mm" as only one "m" with a macron above. I can't find any such
> character in Unicode, just U+1E3F and U+1E41.
Assuming that you want to encode the m-macron directly--rather than
encoding "mm" and letting a German-handwriting-specific rendering system
convert this to m-macron, as Ken suggested--the correct solution would
be to use a combining sequence, "m" followed by U+0304 COMBINING MACRON.
I suppose you could use U+0305 COMBINING OVERLINE instead, but the
decision of which mark to use should be based on whether the mark really
is a macron or an overline, not on the width of the glyph. U+0304
already has to adjust its width depending on whether it appears over an
"i" or an "a".
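
As a minimal sketch of the combining sequence described above, here it is in Python, using only the standard unicodedata module. The variable names are illustrative and not taken from the thread.

```python
import unicodedata

# "m" followed by U+0304 COMBINING MACRON: two code points, one visible letter.
m_macron = "m\u0304"

# The alternative mentioned above: "m" followed by U+0305 COMBINING OVERLINE.
m_overline = "m\u0305"

# List the code points and their official character names.
for ch in m_macron + m_overline:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The two sequences may render alike, but they are distinct strings.
sequences_differ = m_macron != m_overline
```

Which of the two marks to pick is the semantic question discussed above; the code only shows that the choice gives different encoded text.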
> You could of course build something similar with "m"+U+0305 to
> resemble the look, but that won't become "mm" (just "m" or "m¯") after
> a conversion to e.g. ISO-8859-1.
Two important points here. First, a combining sequence doesn't simply
"resemble the look" of a precomposed character; it is *completely
equivalent* (in Unicode terms, canonically equivalent) to the
precomposed character, where one exists. If you wanted to represent an
"a" with macron, which does exist in a precomposed form, you would be
just as correct using either U+0101 or the sequence U+0061 U+0304
(though normalization might require you to choose one or the other;
see Unicode Standard Annex #15).
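
The equivalence can be checked with a short Python sketch, assuming the standard unicodedata module and the normalization forms NFC and NFD from UAX #15. The variable names are illustrative only.

```python
import unicodedata

precomposed = "\u0101"   # LATIN SMALL LETTER A WITH MACRON
sequence = "a\u0304"     # U+0061 followed by U+0304 COMBINING MACRON

# As raw code points the two strings differ...
raw_equal = precomposed == sequence

# ...but normalization maps each form onto the other.
composed = unicodedata.normalize("NFC", sequence)        # -> U+0101
decomposed = unicodedata.normalize("NFD", precomposed)   # -> U+0061 U+0304

# There is no precomposed m-with-macron, so NFC leaves the
# combining sequence as two code points.
m_macron = "m\u0304"
m_nfc = unicodedata.normalize("NFC", m_macron)
```

A comparison that should treat both spellings alike would normalize both sides to the same form before comparing.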
Second, no Unicode character that is not already in (e.g.) ISO 8859-1 is
ever "automatically" converted to an 8859-1 character. You will always
have to have some explicit mapping table or logic to perform such a
conversion. This is just as true for a precomposed character as it is
for a combining sequence. If you wanted to build a conversion layer to
convert between "m̄" and "mm" you could certainly do so.
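
A rough sketch of such a conversion layer in Python follows. The function names and the sample word are hypothetical, and only the lowercase letter is handled. The "explicit mapping" is the single replace() call; without it, the encoder refuses both the combining sequence and a precomposed letter such as U+0101.

```python
import unicodedata

M_MACRON = "m\u0304"  # "m" + U+0304 COMBINING MACRON

def expand_m_macron(text: str) -> str:
    """Explicit mapping: replace each m-macron sequence with "mm"."""
    return text.replace(M_MACRON, "mm")

def to_latin1(text: str) -> bytes:
    """Hypothetical conversion layer: compose, expand m-macron, then encode."""
    # NFC first, so letters like a-umlaut become single Latin-1 characters.
    composed = unicodedata.normalize("NFC", text)
    return expand_m_macron(composed).encode("iso-8859-1")

def encodes_directly(text: str) -> bool:
    """True if the text encodes to ISO 8859-1 with no mapping step."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

sample = "Ko" + M_MACRON + "en"      # handwriting-style spelling of "Kommen"
converted = to_latin1(sample)        # bytes for "Kommen"

# Neither the combining sequence nor a precomposed letter converts by itself.
sequence_ok = encodes_directly(M_MACRON)
precomposed_ok = encodes_directly("\u0101")
```

Going the other way ("mm" to m-macron) would need knowledge of where the handwriting convention applies, which is why it fits better in a rendering system than in a plain text mapping.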
-Doug Ewell
Fullerton, California