RE: Proposal (was: "Missing character" glyph)

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Sun Aug 04 2002 - 12:37:10 EDT


With a bit more thought, we might reduce the minimum point size needed for
the glyph of an unrenderable character as follows:

Each number marks the dot position for that bit. The dot is drawn if the bit
is a one, and the position is left blank if the bit is 0.

The X characters represent lines: an inverted wide squared U at the top, with
its edges coming down to align with the first row of dots, and a wide squared
U at the bottom, with its edges aligning with the 4th row of dots (bits 4
to 7). These lines distinguish the glyph and help the eye determine the
dot positioning.

Plane 0-15

      XXXXXXXXXXXXXXXX
     X 19 18 17 16 X
       15 14 13 12
       11 10 09 08
     X 07 06 05 04 X
     X 03 02 01 00 X
      XXXXXXXXXXXXXXXX
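
As a rough illustration of the Plane 0-15 layout, here is a minimal Python
sketch that turns a code point into the five rows of dots. The function name
dot_rows and the '*' / '.' output characters are my own choices for the
example and are not part of the proposal.

```python
def dot_rows(code_point: int) -> list[str]:
    """Return the 5 dot rows for a plane 0-15 code point.

    Each row holds 4 bits, most significant bit on the left, and the top
    row holds bits 19..16. '*' marks a dot (bit is 1), '.' marks a blank
    position (bit is 0). Both symbols are illustrative assumptions.
    """
    if not 0 <= code_point <= 0xFFFFF:
        raise ValueError("this layout only covers planes 0-15")
    rows = []
    for row in range(4, -1, -1):  # row 4 is the top row
        bits = range(row * 4 + 3, row * 4 - 1, -1)
        rows.append("".join("*" if (code_point >> b) & 1 else "." for b in bits))
    return rows


if __name__ == "__main__":
    for line in dot_rows(0x0041):
        print(line)
```

For U+0041 only bits 6 and 0 are set, so the sketch draws one dot in the
fourth row and one in the fifth.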

Plane 16

The top is an inverted wide squared U with a straight line between its end
points instead of a row of dots.

      XXXXXXXXXXXXXXXX
     XXXXXXXXXXXXXXXXXX
       15 14 13 12
       11 10 09 08
     X 07 06 05 04 X
     X 03 02 01 00 X
      XXXXXXXXXXXXXXXX

Invalid Plane

The top is a pair of rectangles.

     XXXXXXXX XXXXXXXX
     XXXXXXXX XXXXXXXX
       31 30 29 28
       27 26 25 24
     X 23 22 21 20 X
     X 19 18 17 16 X
      XXXXXXXXXXXXXXXX

There are many ways to implement this, but the principle is to provide a
unique glyph for each different unrenderable character that can be traced to
the code point.
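
One possible way to pick among the three frames drawn above is sketched
here in Python. It reads the diagrams literally: planes 0-15 show bits 19 to
0, plane 16 shows bits 15 to 0, and anything past plane 16 shows bits 31 to
16 of a 32-bit value. The function name, the frame labels, and the 32-bit
limit are assumptions made for the example.

```python
def fallback_frame(value: int) -> tuple[str, int, int]:
    """Return (frame label, highest bit shown, lowest bit shown).

    The labels and the 32-bit upper limit are illustrative assumptions;
    the bit ranges follow the three diagrams as drawn.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected an unsigned 32-bit value")
    if value <= 0xFFFFF:
        # Planes 0-15: five rows of dots, bits 19..0.
        return ("plane 0-15", 19, 0)
    if value <= 0x10FFFF:
        # Plane 16: solid top line replaces the top dot row, bits 15..0.
        return ("plane 16", 15, 0)
    # Beyond plane 16: pair of rectangles on top, bits 31..16.
    return ("invalid plane", 31, 16)


if __name__ == "__main__":
    for v in (0x0041, 0x10FFFF, 0x110000):
        print(hex(v), fallback_frame(v))
```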

If there have to be changes to the font engines, I do not think that they
will be major.

This can be a suggested standard as an alternative to the current 5.3 for
those systems that can support it. This way systems can migrate as they are
able.

Since each unrenderable character now gets a unique glyph, this raises a BIDI
issue. I think that you have to display them in the stored order, left to
right. Otherwise it will get sticky if the characters fall between RTL and
LTR text or the other way around. Beginning, initial, final and stand-alone
form issues should not be as important since they should still render
recognizable characters. Scripts such as the Southeast Asian and Indic
ones, which depend on multiple characters producing a composite glyph, may be
more affected.

Carl



This archive was generated by hypermail 2.1.2 : Sun Aug 04 2002 - 10:44:05 EDT