From: Richard Wordingham (email@example.com)
Date: Mon May 01 2006 - 04:32:21 CST
John H. Jenkins wrote on Monday, May 01, 2006 at 3:54 AM:
>> In most applications, if a character is requested that isn't
>> available in the current font, most applications will instead
>> display an empty box. Essentially what I'm wondering is whether
>> this box symbol is application specific, or font specific, and in
>> the latter case, what the character symbol is.
> It's actually font-specific (at least in TrueType/OpenType fonts).
> It typically corresponds to no character in the font; it's a
> character-less glyph.
I'll expand on this, as fonts can contain many character-less glyphs, of
The basic point is that the glyphs in a font in the OpenType format are
identified by number, and glyph 0 is the one to be used when there is no
proper glyph for a character. (PostScript fonts work by name, and the
corresponding glyph is identified by the name '.notdef'.) This glyph is
sometimes known as the 'missing glyph'.
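
To make the fallback concrete, here is a minimal sketch in Python. It is only a toy model: the dictionary stands in for a font's character-to-glyph mapping, and the glyph numbers 36 and 37 are invented for illustration. The one thing taken from the format itself is that glyph 0 is the missing glyph.

```python
# Toy model of character-to-glyph lookup. The dict is a stand-in for a
# font's character map; it is not read from a real font file.
NOTDEF_GLYPH_ID = 0  # glyph 0 is the 'missing glyph' in the OpenType format


def glyph_for(char_map, codepoint):
    """Return the glyph number for codepoint, or glyph 0 if it is unmapped."""
    return char_map.get(codepoint, NOTDEF_GLYPH_ID)


# Hypothetical glyph numbers for 'A' and 'B'; nothing is mapped for U+0E01.
toy_char_map = {0x0041: 36, 0x0042: 37}

glyph_ids = [glyph_for(toy_char_map, ord(ch)) for ch in "AB\u0e01"]
print(glyph_ids)
```

The third character has no entry, so the renderer would draw whatever shape the font designer put in glyph 0, which is why the box differs from font to font.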
Fonts defined in the OpenType format have a 'table' called the 'cmap' table,
which converts from character code to glyph. Actually, there may be several
sets of look-up tables, for different coding systems (even UCS-2 vs.
full-range Unicode) and different platforms - for example, the Apple logo
should not be accessible on a Windows platform! This is not the end of the
story, for the conversion may be further refined (in accordance with data
tables stored in the font), e.g. to support all the complexities of Indic
and Arabic shaping. To give a Latin script example, the width of a macron
may depend on what it is placed above, with different glyphs for different
widths. In this case there will be multiple glyphs all corresponding, in a
sense, to U+0304, though the cmap will map U+0304 to a specific glyph.
(This dependence is a necessity for a combining overline, U+0305, if it is
to be handled by normal mechanisms.)
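
The two stages described above might be sketched as follows, again as a toy model in Python rather than a description of any real shaping engine. The (platform, encoding) pairs are the real identifiers for Windows/Unicode BMP (3, 1), Windows/full-range Unicode (3, 10) and Macintosh/Roman (1, 0), but every glyph number, and the table that picks a macron by the preceding glyph, is an assumption made up for the example. A real font would express the second stage through its own layout tables.

```python
# Stage 1: several look-up tables keyed by (platform ID, encoding ID).
# All glyph numbers below are hypothetical.
SUBTABLES = {
    (3, 1): {0x0069: 7, 0x006D: 8, 0x0304: 20},                 # Windows, BMP only
    (3, 10): {0x0069: 7, 0x006D: 8, 0x0304: 20, 0x10000: 50},   # Windows, full range
    (1, 0): {0x69: 7, 0x6D: 8, 0xF0: 99},                       # Mac Roman; 0xF0 is the Apple logo
}


def pick_subtable(subtables, preferences):
    """Return the first subtable matching the platform's preference order."""
    for key in preferences:
        if key in subtables:
            return subtables[key]
    return {}


windows_map = pick_subtable(SUBTABLES, [(3, 10), (3, 1)])

# Stage 2: refine the default glyph for U+0304 according to its base glyph.
DEFAULT_MACRON = 20
MACRON_FOR_BASE = {7: 21, 8: 22}  # narrow macron after 'i', wide after 'm'


def shape(text, char_map):
    """Map characters to glyphs, then swap in width-specific macrons."""
    glyphs = [char_map.get(ord(ch), 0) for ch in text]
    for i in range(1, len(glyphs)):
        if glyphs[i] == DEFAULT_MACRON:
            glyphs[i] = MACRON_FOR_BASE.get(glyphs[i - 1], DEFAULT_MACRON)
    return glyphs


shaped = shape("m\u0304i\u0304", windows_map)
print(shaped)
```

Glyphs 21 and 22 appear in no look-up table here, yet both represent U+0304 in context; and the Apple logo (glyph 99) is reachable only through the Macintosh table.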
For simplicity in the construction of the cmap, some characters may actually
be mapped to the missing glyph. In these days of font substitution, I
suspect this is a bad idea.
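
One way this could go wrong, sketched in Python under the assumption that the substitution logic merely asks whether a font has any entry for the character. The font names, the mappings, and both checking strategies are hypothetical; real applications may well test for glyph 0 explicitly.

```python
# Two hypothetical fonts. "Primary" maps U+0E01 explicitly to glyph 0
# (the missing glyph); "Fallback" has a real glyph for it.
FONTS = [
    ("Primary", {0x0041: 1, 0x0E01: 0}),
    ("Fallback", {0x0E01: 12}),
]


def first_font_with_entry(fonts, codepoint):
    """Naive check: any entry at all counts as support."""
    for name, char_map in fonts:
        if codepoint in char_map:
            return name
    return None


def first_font_with_real_glyph(fonts, codepoint):
    """Stricter check: an entry pointing at glyph 0 does not count."""
    for name, char_map in fonts:
        if char_map.get(codepoint, 0) != 0:
            return name
    return None


naive_choice = first_font_with_entry(FONTS, 0x0E01)
strict_choice = first_font_with_real_glyph(FONTS, 0x0E01)
print(naive_choice, strict_choice)
```

With the naive check the box from "Primary" is shown even though "Fallback" could have drawn the character.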
Finally, there may be other character-less glyphs that simply cannot be
accessed via character codes at all.
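
As a rough illustration in Python, the glyphs with no character code can be found by subtracting everything the look-up tables point at from the full range of glyph numbers. The glyph count and mappings below are invented for the example.

```python
def unmapped_glyphs(glyph_count, char_maps):
    """Return glyph numbers that no character code maps to directly."""
    reachable = {gid for char_map in char_maps for gid in char_map.values()}
    return sorted(set(range(glyph_count)) - reachable)


# Hypothetical font with six glyphs and two look-up tables.
toy_maps = [
    {0x0041: 1, 0x0304: 3},
    {0x0041: 1, 0x0042: 2},
]

orphans = unmapped_glyphs(6, toy_maps)
print(orphans)
```

Glyph 0 turns up in the result unless some character has been mapped to it deliberately; the others might be contextual variants or glyphs that nothing in the font refers to at all.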