RE: script or block detection needed for Unicode fonts

From: Murray Sargent
Date: Sat Sep 28 2002 - 16:19:58 EDT

  • Next message: David Starner: "Re: script or block detection needed for Unicode fonts"

    Michael Everson said:
    > I don't understand why a particular bit has to be set in
    > some table. Why can't the OS just accept what's in the font?

    The main reason is performance. If an application has to check the font
    cmap for every character in a file, reading the file slows down.
    Accordingly, programs typically check a font's script bits to see
    whether the font claims to support a script. If so, the font is
    accepted; otherwise, another font that does have the appropriate bit
    set is chosen. This information is cached, so the check is very fast.
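    As a rough illustration of that fast path, here is a sketch in Python.
    The font names, script names, and the set-based representation of the
    script bits are all invented for illustration (real fonts expose this
    as ulUnicodeRange bits in the OS/2 table):

```python
# Cached "script bits" per font, modeled as a set of script names.
# Font names and data are hypothetical.
SCRIPT_BITS = {
    "FontA": {"Latin", "Greek"},
    "FontB": {"Latin", "Cyrillic"},
}

def pick_font(preferred: str, script: str, fonts=SCRIPT_BITS) -> str:
    """Accept the preferred font if it claims the script; else fall back
    to the first font that does claim it."""
    if script in fonts.get(preferred, set()):
        return preferred
    for name, scripts in fonts.items():
        if script in scripts:
            return name
    return preferred  # no better candidate; keep the requested font
```

    Because the lookup touches only the cached script bits, no cmap has to
    be consulted per character on this path.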

    A more common problem occurs with fonts that claim to support, say,
    Greek or Cyrillic but only contain the most common characters in those
    scripts. In RichEdit we now check the cmap for the less common Greek,
    Cyrillic, Arabic, etc., characters to ensure that they are in fact in
    the font. If not, we switch to some other font that has them.
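    The extra check described above might look something like the sketch
    below. The cmaps and the probe code points are invented for
    illustration; a real implementation would read the font's actual cmap
    table:

```python
# Hypothetical cmaps: sets of code points each font actually maps.
CMAP = {
    "BasicGreek": {0x0391, 0x03B1},                   # only common letters
    "FullGreek":  {0x0391, 0x03B1, 0x03D8, 0x03F4},   # includes rarer ones
}

# Less common code points used as probes for each script
# (here, archaic Greek letters; the choice is illustrative).
PROBES = {"Greek": [0x03D8, 0x03F4]}

def really_supports(font: str, script: str) -> bool:
    """Verify the script claim by probing the cmap for rare characters."""
    cmap = CMAP[font]
    return all(cp in cmap for cp in PROBES[script])

def choose(fonts: list[str], script: str) -> str:
    """Return the first font whose cmap covers all probe characters,
    falling back to the first candidate if none qualifies."""
    for f in fonts:
        if really_supports(f, script):
            return f
    return fonts[0]
```

    Probing a handful of rare characters is far cheaper than checking every
    character in the file, yet it catches fonts whose script claim is only
    partially honest.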

    The problem with a font setting a script bit when it contains only a
    single glyph for that script is that the font may then be chosen for
    other common characters in the script, resulting in a missing-character
    glyph at display time.

    I suppose one could have it both ways by instructing a program to always
    check the cmap for a given font, thereby bypassing the more streamlined
    algorithms. This would be a handy option for specialized fonts. We'd
    need some font convention to turn on this behavior.
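    One way such a convention could work is sketched below: a per-font flag
    that forces the slower per-character cmap lookup, bypassing the cached
    script bits. The flag name and data are purely hypothetical, not an
    existing font feature:

```python
# Hypothetical cmap for a specialized font with a single glyph.
CMAP = {"SymbolFont": {0x2609}}

# Fonts that opt out of the fast script-bit path.
ALWAYS_CHECK_CMAP = {"SymbolFont"}

def font_covers(font: str, cp: int, claims_script: bool) -> bool:
    """Decide whether the font can display the code point."""
    if font in ALWAYS_CHECK_CMAP:
        # Slow but exact: consult the cmap for this character.
        return cp in CMAP.get(font, set())
    # Fast path: trust the cached script bit.
    return claims_script
```

    Fonts without the flag keep the streamlined behavior, so only
    specialized fonts pay the per-character cost.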


    This archive was generated by hypermail 2.1.5 : Sat Sep 28 2002 - 17:06:23 EDT