Revised proposal for "Missing character" glyph

From: Carl W. Brown (
Date: Fri Aug 16 2002 - 13:02:38 EDT

Proposed representation for unknown and missing characters. This would be
an alternative to the method currently described in section 5.3.

The missing or unknown character would be represented as a series of
vertical hex digit pairs for each byte of the character. BMP characters
would be represented with 4 hex digits or two pairs of hex digits. Plane
1-16 characters would be represented as 6 digits or 3 pairs of digits.
Garbage data with non-zero bits 24-31 may require 8 digits or 4 pairs of
digits.
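
The digit counts above can be sketched in a few lines of Python. This is
only an illustration, not part of the proposal: the function names are
made up, and it assumes that a "vertical hex digit pair" means the two
digits of one byte stacked with the high digit on top.

```python
def hex_pairs(value):
    """Split a value of up to 32 bits into two-digit hex pairs, one per byte."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a value that fits in 32 bits")
    if value <= 0xFFFF:
        nbytes = 2   # BMP: 4 hex digits, 2 pairs
    elif value <= 0xFFFFFF:
        nbytes = 3   # planes 1-16 (bits 24-31 are zero): 6 digits, 3 pairs
    else:
        nbytes = 4   # garbage with non-zero bits 24-31: 8 digits, 4 pairs
    return [f"{b:02X}" for b in value.to_bytes(nbytes, "big")]


def stacked_rows(value):
    """Return (top, bottom) text rows, assuming the high digit sits on top."""
    pairs = hex_pairs(value)
    top = "".join(pair[0] for pair in pairs)
    bottom = "".join(pair[1] for pair in pairs)
    return top, bottom


if __name__ == "__main__":
    for code_point in (0x0041, 0x1D11E, 0x7FFFFFFF):
        top, bottom = stacked_rows(code_point)
        print(f"{code_point:08X}: {hex_pairs(code_point)}")
        print("  " + top)
        print("  " + bottom)
```

Read column by column, the two rows give back the byte values, so a
reader can recover the code point from the display.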

Untrained people would recognize this representation as unrenderable data
or garbage. It would serve the same function as a missing glyph character,
but because it looks different from normal glyphs, readers would know that
something was wrong and that the text did not just happen to contain funny
characters.

It would help people find the problem, and people with the Unicode book
could decipher the text by looking up the code points. If the information
was truly critical, they could have the text deciphered.

The missing character glyphs would be best rendered as a series of glyphs
by a font engine capable of glyph positioning. If that is not possible,
each character could also be rendered as a fractional space, followed by
two to three hex pair glyphs (one for each byte of the character), followed
by another fractional space. This would require 256 glyphs for the vertical
hex pairs and a fractional space glyph.
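
A minimal sketch of that fallback, again in Python and again only as an
illustration: the glyph names ("fracspace", "hexpair.XX") are hypothetical
placeholders, not names from any real font, and the table simply holds
one entry per possible byte value. Garbage values with non-zero bits 24-31
produce four pair glyphs here rather than two or three.

```python
# Hypothetical glyph names; a real font would define its own.
FRACTIONAL_SPACE = "fracspace"
HEX_PAIR_GLYPHS = [f"hexpair.{b:02X}" for b in range(256)]  # one per byte value


def fallback_glyph_run(value):
    """Build the glyph run for one unrenderable character.

    The run is: fractional space, one hex pair glyph per byte, fractional space.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a value that fits in 32 bits")
    if value <= 0xFFFF:
        nbytes = 2   # BMP
    elif value <= 0xFFFFFF:
        nbytes = 3   # planes 1-16
    else:
        nbytes = 4   # garbage with non-zero bits 24-31
    run = [FRACTIONAL_SPACE]
    run += [HEX_PAIR_GLYPHS[b] for b in value.to_bytes(nbytes, "big")]
    run.append(FRACTIONAL_SPACE)
    return run


if __name__ == "__main__":
    print(fallback_glyph_run(0x20AC))
    print(fallback_glyph_run(0x1D11E))
```

Because each pair glyph depends only on a single byte value, the table
never needs more than 256 entries plus the space, whatever the character.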

This proposal would give vendors a standardized approach that they could
adopt to clarify missing character rendering and reduce support costs.
Including it in the standard would make the solution consistent across
vendors.
