RE: Revised proposal for "Missing character" glyph

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Mon Aug 26 2002 - 22:42:13 EDT


Ken,

The little square boxes do not help much if you want to know exactly what
the missing characters are. I do, however, feel that any solution to the
problem should be Unicode based. If it is left to the vendors, they may
display the code page characters and you are guessing again.

The tool idea is great but I do not see how it could be embedded in the OS
without changing the application. It will also require user training.
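
To make the tool idea concrete, here is a minimal Python sketch of the kind
of "query" Ken describes below: take a piece of selected text and break it
into its character sequence. The function name describe_text is hypothetical,
and a real accessory would have to hook into the text control, which this
sketch does not attempt.

```python
import unicodedata

def describe_text(text):
    """Return one (code point label, character name) pair per character."""
    rows = []
    for ch in text:
        label = "U+%04X" % ord(ch)
        # Characters without a name in Python's Unicode database get a
        # placeholder instead of raising ValueError.
        name = unicodedata.name(ch, "<no name in this Unicode database>")
        rows.append((label, name))
    return rows

# Query a Latin letter and a Telugu letter, displayable or not.
for label, name in describe_text("A\u0C05"):
    print(label, name)
# U+0041 LATIN CAPITAL LETTER A
# U+0C05 TELUGU LETTER A
```

The lookup works the same whether or not the font has a glyph for the
character, which is the point of querying the text rather than the display.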

I think that as we move away from code page text we will find that the next
big problem will be characters that are missing from the font or sets of
fonts. The trick will be to change the set of fonts. This might require
trial and error if we do not have good diagnostic tools.

Implementing this change will probably be easier than using the special
symbols for the script, which will also require special handling and may not
catch all errors. This approach will also allow critical text that cannot
be redisplayed to be deciphered.
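
As a rough illustration of the hex digit pair display quoted below, this
Python sketch stacks the two hex digits of each byte vertically. The function
name hex_pair_rows is hypothetical, and the choice of UTF-16 big-endian bytes
is only an assumption made for the example, not part of the proposal text
quoted here.

```python
def hex_pair_rows(ch, encoding="utf-16-be"):
    """Return (top_row, bottom_row) for the stacked hex digits of each byte.

    Assumption: the bytes are those of the given encoding (UTF-16BE here).
    """
    data = ch.encode(encoding)
    pairs = ["%02X" % b for b in data]
    top = " ".join(p[0] for p in pairs)      # high hex digit of each byte
    bottom = " ".join(p[1] for p in pairs)   # low hex digit of each byte
    return top, bottom

top, bottom = hex_pair_rows("\u0C05")
print(top)     # 0 0
print(bottom)  # C 5
```

Read column by column, the output gives the bytes 0C and 05, so someone with
the Unicode book can look up U+0C05 even when no glyph is available.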

This has been a pet peeve of mine having used the Fujitsu Shift JIS solution
and seen it work in a real live situation.

Carl

> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
> Behalf Of Kenneth Whistler
> Sent: Monday, August 26, 2002 2:01 PM
> To: unicode@unicode.org
> Cc: kenw@sybase.com
> Subject: Re: Revised proposal for "Missing character" glyph
>
>
> [Resend of a response which got eaten by the Unicode email
> during the system maintenance last week. Carl already responded
> to me on this, but others may not have seen what he was
> responding to. --Ken]
>
>
> > Proposed unknown and missing character representation. This would be an
> > alternative to the method currently described in 5.3.
> >
> > The missing or unknown character would be represented as a series of
> > vertical hex digit pairs for each byte of the character.
>
> The problem I have with this is that it seems to be an overengineered
> approach that conflates two issues:
>
> a. What does a font do when requested to display a character
> (or sequence) for which it has no glyph.
>
> b. What does a user do to diagnose text content that may be
> causing a rendering failure.
>
> For the first problem, we already have a widespread approach that
> seems adequate. And other correspondents on this topic have pointed
> out that the particular approach of displaying hex numbers for
> characters may pose technical difficulties for at least some font
> technologies.
>
> [snip]
>
> >
> > This representation would be recognized by untrained people as
> > unrenderable data or garbage. So it would serve the same function as a
> > missing glyph character except that it would be different from normal
> > glyphs so that they would know that something was wrong and the text did
> > not just happen to have funny characters.
>
> I don't see any particular problem in training people to recognize when
> they are seeing their fonts' notdef glyphs. The whole concept of "seeing
> little boxes where the characters should be" is not hard to explain to
> people -- even to people who otherwise have difficulty with a lot of
> computer abstractions.
>
> Things will be better-behaved when applications finally get past the
> related but worse problem of screwing up the character encodings --
> which results in the more typical misdisplay: lots of recognizable
> glyphs, but randomly arranged into nonsensical junk. (Ah, yes, that
> must be another piece of Korean spam mail in my mail tray.)
>
> >
> > It would aid people in finding the problem and for people with Unicode
> > books the text would be decipherable. If the information was truly
> > critical they could have the text deciphered.
>
> Rather than trying to engineer a questionable solution into the fonts,
> I'd like to step back and ask what would better serve the user
> in such circumstances.
>
> And an approach which strikes me as a much more useful and extensible
> way to deal with this would be the concept of a "What's This?"
> text accessory. Essentially a small tool that a user could select
> a piece of text with (think of it like a little magnifying glass,
> if you will), which will then pop up the contents selected, deconstructed
> into its character sequence explicitly. Limited versions of such things
> exist already -- such as the tooltip-like popup windows for Asmus'
> Unibook program, which give attribute information for characters
> in the code chart. But I'm thinking of something a little more generic,
> associated with textedit/richedit type text editing areas (or associated
> with general word processing programs).
>
> The reason why such an approach is more extensible is that it is not
> merely focussed on the nondisplayable character glyph issue, but rather
> represents a general ability to "query" text, whether normally
> displayable or not. I could query a black box notdef glyph to find
> out what in the text caused its display; but I could just as well
> query a properly displayed Telugu glyph, for example, to find out what
> it was, as well.
>
> This is comparable (although more point-oriented) to the concept of
> giving people a source display for HTML, so they can figure out
> what in the markup is causing rendering problems for their rich
> text content.
>
> [snip]
>
> > This proposal would provide a standardized approach that vendors could
> > adopt to clarify missing character rendering and reduce support costs. By
> > including this in the standard we could provide a cross vendor approach.
> > This would provide a consistent solution.
>
> In my opinion, the standard already provides a description of a
> cross-vendor approach to the notdef glyph problem, with the advantage
> that it is the de facto, widely adopted approach as well. As long as
> font vendors stay away from making {p}'s and {q}'s their notdef glyphs,
> as I think we can safely presume they will, and instead use variants on
> the themes of hollowed or filled boxes, then the problem of *recognition*
> of the notdef glyphs for what they are is a pretty marginal problem.
>
> And as for how to provide users better diagnostics for figuring out the
> content of undisplayable text, I suppose the standard could suggest some
> implementation guidelines there, but this might be a better area to just
> leave up to competing implementation practice until certain user interface
> models catch on and get widespread acceptance.
>
> --Ken
>
>



This archive was generated by hypermail 2.1.2 : Mon Aug 26 2002 - 21:02:54 EDT