Re: Character identities

From: Doug Ewell
Date: Mon Oct 28 2002 - 20:37:35 EST

    My USD 0.02, as someone who is neither a professional typographer nor a
    font designer (more than one, but not quite two, different things)...

    Discussions about the character-glyph model often mention the "essential
    characteristics" of a given character. For example, a Latin capital A
    can be bold, italic, script, sans-serif, etc., but it must always have
    that essential "A-ness" such that readers of (e.g.) English can identify
    it as an A instead of, say, an O or a 4 or a picture of a duck. (Mark
    Davis has a chart showing dozens of different A's in his "Unicode Myths".)

    Somewhere in between the obvious relationships (A = A, B ≠ A), we have
    the case pair A and a. They are not identical, but they are certainly
    more similar to each other than are A and B.
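    A minimal sketch of that in-between relationship, using only
    Python's standard library. The helper name related_by_case is my
    own invention for illustration; it only assumes that case folding
    is a fair stand-in for "same letter, different case."

```python
import unicodedata

def related_by_case(x: str, y: str) -> bool:
    """True when x and y are different characters that case-fold to the same text."""
    return x != y and x.casefold() == y.casefold()

# Identical, case pair, and unrelated letters respectively.
for pair in [("A", "A"), ("A", "a"), ("A", "B")]:
    names = " / ".join(unicodedata.name(c) for c in pair)
    print(pair, related_by_case(*pair), names)
```

    Only the A/a pair comes back True: the two are distinct characters
    with distinct names, but the case mapping ties them together in a
    way that nothing ties A to B.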

    It seems to me, as a non-font guy, that calling a font a "Unicode font"
    implies two things:

    1. It must be based on Unicode code points. For TrueType and OpenType
    fonts, this implies a Unicode cmap; for other font technologies it
    implies some more-or-less equivalent mechanism. The point is that
    glyphs must be associated with Unicode code points (not necessarily
    1-to-1, of course), not merely with an internal 8-bit table that can be
    mapped to Unicode only through some other piece of software.

    2. The glyphs must reflect the "essential characteristics" of the
    Unicode character to which they are mapped. That means a capital A can
    be bold, italic, script, sans-serif, etc. A small a can also be
    small-caps (or even full-size caps), but I think this is the only
    controversial point.
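    Point 1 can be sketched as a toy model. This is not how any real
    font file is parsed; the dictionaries, the glyph names, and the
    choice of cp1252 as the "other piece of software" are all
    assumptions made up for the example.

```python
# Toy "Unicode font": glyphs are keyed directly by Unicode code point.
# The mapping need not be 1-to-1; two code points may share one glyph.
unicode_cmap = {
    0x0041: "A",   # LATIN CAPITAL LETTER A
    0x0391: "A",   # GREEK CAPITAL LETTER ALPHA reusing the same glyph
    0x0061: "a",   # LATIN SMALL LETTER A
}

# Toy legacy font: glyphs are keyed by an internal 8-bit slot number.
legacy_table = {0x41: "A", 0x61: "a", 0xC1: "Aacute"}

def glyph_for_char(ch):
    """Look the character up by its code point, with no outside help."""
    return unicode_cmap.get(ord(ch))

def legacy_glyph_for_char(ch, codec="cp1252"):
    """Reach the glyph only by first converting through an external codec."""
    try:
        slot = ch.encode(codec)[0]   # the codec is assumed to be single-byte
    except UnicodeEncodeError:
        return None                  # the codec has no slot for this character
    return legacy_table.get(slot)

print(glyph_for_char("A"), glyph_for_char("\u0391"))
print(legacy_glyph_for_char("\u00c1"))
```

    The second lookup works only because a codec outside the font
    supplies the link to Unicode, which is exactly the arrangement
    that point 1 rules out.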

    In a Unicode font, U+0041 cannot be mapped to a capital A with macron,
    as it is in Bookshelf Symbol 1; nor to a six-pointed star, as in
    Monotype Sorts; nor to a hand holding up two fingers, as in Wingdings.
    (But it can be mapped to a "notdef" glyph, if the font makes no claim to
    supporting U+0041.)
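    Here is a rough sketch of that rule as a check. Everything in it
    is hypothetical: the glyph names, the "base name before the first
    dot" convention for style variants, and the tiny EXPECTED table
    are assumptions for illustration, not data read from the fonts
    named above.

```python
NOTDEF = ".notdef"

# Hypothetical base glyph name a reader would accept for each code point.
EXPECTED = {0x0041: "A", 0x0061: "a"}

def keeps_identity(code_point, glyph_name):
    """Accept .notdef, or any style variant whose base name matches.

    Code points missing from EXPECTED are rejected by this sketch.
    """
    if glyph_name == NOTDEF:
        return True                        # the font makes no claim here
    base = glyph_name.split(".", 1)[0]     # "A.bolditalic" -> "A"
    return base == EXPECTED.get(code_point)

# What each (toy) font maps U+0041 to.
mapped_at_0041 = {
    "plain text font": "A",
    "bold italic font": "A.bolditalic",
    "Bookshelf Symbol 1": "Amacron",
    "Monotype Sorts": "sixpointedstar",
    "Wingdings": "victoryhand",
    "font without Latin": NOTDEF,
}

for font, glyph in mapped_at_0041.items():
    print(font, keeps_identity(0x0041, glyph))
```

    Under these assumptions the three symbol-font mappings fail,
    while the styled A and the notdef glyph both pass.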

    U+0915 absolutely can have snow on it, or be bold or italic or whatever
    (or all of these), as long as a Devanagari reader would recognize its
    essential "ka-ness." It cannot look like a Latin A, nor for that matter
    can U+0041 look like a Devanagari ka.
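    For what it's worth, the two identities are easy to look up in
    Python's standard unicodedata module. The script_hint helper is a
    made-up heuristic that just takes the first word of the character
    name; it is not a real script-property lookup.

```python
import unicodedata

def script_hint(ch):
    """Crude guess at the script: the first word of the character name."""
    return unicodedata.name(ch).split()[0]

for ch in ("\u0915", "A"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), script_hint(ch))
```

    Whatever a glyph for U+0915 looks like, snow and all, it has to
    read as the first of these and not the second.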

    Font guys, do you agree with this?

    Of course, the term "Unicode font" is also often used to mean "a font
    that covers all, or nearly all, of Unicode." Font technologies
    generally don't even allow full coverage (a TrueType or OpenType font
    can hold at most 65,535 glyphs), and even by the standards of
    "nearly" we are still limiting ourselves to things like Bitstream
    Cyberbit, Arial Unicode MS, Code2000, Cardo, etc. Right or wrong, this
    is a commonly accepted meaning for "Unicode font."
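    This coverage sense of the term can at least be measured. A
    minimal sketch, assuming a font is represented simply as a set of
    supported code points (a simplification of my own) and that
    "assigned" means whatever the unicodedata module in the running
    Python considers assigned, which varies by version:

```python
import unicodedata

def coverage(supported, first, last):
    """Fraction of assigned code points in [first, last] that are supported.

    Unassigned (Cn), surrogate (Cs) and private-use (Co) code points
    are left out of the denominator.
    """
    assigned = [cp for cp in range(first, last + 1)
                if unicodedata.category(chr(cp)) not in ("Cn", "Cs", "Co")]
    if not assigned:
        return 0.0
    covered = sum(1 for cp in assigned if cp in supported)
    return covered / len(assigned)

# A toy font that supports only the Latin capitals A-Z.
capitals_only = set(range(0x41, 0x5B))

print(coverage(capitals_only, 0x41, 0x5A))   # the capitals themselves
print(coverage(capitals_only, 0x00, 0x7F))   # all of Basic Latin
```

    Passing 0x0000 and 0x10FFFF as the range gives the "all, or
    nearly all, of Unicode" figure for whatever set of code points
    you feed in.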

    -Doug Ewell
     Fullerton, California
