FONT ID character

From: Glen Perkins (
Date: Tue Dec 07 1999 - 21:03:40 EST

Would it be appropriate to set aside a single codepoint in the BMP to be
used for a glyph that identifies the font being used (or has this already
been done, and I've just never noticed)? Would the REPLACEMENT CHARACTER
U+FFFD be a better candidate than assigning this to a currently unused
codepoint?

Many times, when doing i18n work, I have to tell some black box, a Windows
edit control for example, to switch to some special font in order to display
my inscrutable text, but when I see the result on screen, it's not using my
desired font, making it even more inscrutable than expected.

Sometimes this font error is obvious, but other times it's not. For example,
I might get missing glyph boxes instead, sending me off to do hexadecimal
pointer arithmetic or some other ritual sacrifice that programmers resort to
when trying to appease the gods. Often, the fault isn't in my stars or
bytes; the control simply isn't using the font that I expected, and the
missing glyph box is so generic that I can't immediately tell. Other times
I'm seeing the characters I expected, but the reason the display is so
mangled is that this font is completely unsuited for this script system but
somehow got selected instead of my desired font.

Once it starts to dawn on me that maybe this isn't the font I'm thinking it
is, then the first question that arises is "well, what font IS it, then?"
Correctly identifying that font is often the first step toward fixing the
problem.

What I'd like to be able to do is add the U+XXXX (FONT ID) character to my
string and have the glyph for that character tell me what font it's from.
Some abbreviation of the font face name that's readable when displayed at a
large size, for example. It's just for debugging, so if it has to be
displayed at 72 points to be readable on screen, that's usually okay. In
cases where you can't display it large enough to read it, you're no worse
off than if that glyph didn't exist at all.
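
A minimal sketch of what adding the character to a string might look like
in a debugging helper, written in Python. Everything named here is an
assumption for illustration: no FONT ID character exists, so the Private
Use Area codepoint U+E000 stands in for U+XXXX, and with_font_id and
strip_font_id are hypothetical helper names.

```python
# Hypothetical stand-in for the proposed FONT ID character (U+XXXX).
# U+E000 is a Private Use Area codepoint, used here only for illustration.
FONT_ID = "\uE000"


def with_font_id(text: str, marker: str = FONT_ID) -> str:
    """Return text with the FONT ID marker appended, for debugging display."""
    if text.endswith(marker):
        return text  # already tagged; avoid stacking markers
    return text + marker


def strip_font_id(text: str, marker: str = FONT_ID) -> str:
    """Remove every FONT ID marker before the text is stored or shipped."""
    return text.replace(marker, "")
```

The marker would be appended only while debugging and stripped again
before the string goes anywhere else, since it carries no meaning of its
own.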

I realize that this isn't always going to work, and may sometimes be
misleading. Sometimes the problem is that you have font switching/binding
going on, changing fonts automagically in mid-string, so this FONT ID
character is coming from one font while your problem characters are coming
from a different font. I realize that, but we have that problem anyway. A
lot of us have favorite quirky glyphs that we use as heuristics to
double-check the font, but that heuristic fails in cases like the one above.
In cases where the font isn't switching in mid-string, though, but is just
the wrong font (a more common scenario), or you just want to double-check
that the font is really what you think it is (most of the time), using a
glyph that is formally designed for this purpose would be a lot easier than
squinting at the glyphs, looking for clues as to their font's identity.

Whether that FONT ID character should be a separate character, or just take
the form of a suggestion to font makers that they make U+FFFD more
informative, is a separate issue.

Just a thought,
Glen Perkins
