Re: Question about Unicode Ranges in TrueType fonts

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jun 26 2003 - 14:34:17 EDT


    On Thursday, June 26, 2003 4:13 PM, Andrew C. West <andrewcwest@alumni.princeton.edu> wrote:

    > On Thu, 26 Jun 2003 14:26:13 +0200, "Philippe Verdy" wrote:
    >
    > > Isn't there a work-around with the following function (quote from
    > > Microsoft MSDN):
    > > (with the caveat that you first need to allocate and fill a Unicode
    > > string for the
    > > codepoints you want to test, and this can be lengthy if one wants
    > > to retrieve the full list of supported codepoints).
    > > However, this is still the best function to use to know if a string
    > > can effectively
    > > be rendered before drawing it...
    > >
    > > _*GetGlyphIndices*_
    > >
    >
    > GetGlyphIndices() or Uniscribe's ScriptGetCMap() would be OK for
    > checking coverage for small Unicode blocks such as Gothic (27
    > codepoints) or even Mathematical Alphanumeric Symbols (992
    > codepoints), but I suspect your application would freeze if you tried
    > to use it to work out exact codepoint coverage of CJK-B (42,711
    > codepoints) and PUA-A and PUA-B (65,534 codepoints each).

    That's why I added the caveat. For a real application, however,
    this is a good way to check whether a given text can actually be
    displayed.
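
    A minimal sketch of that check in Python. It is not a Win32 binding:
    it only models what GetGlyphIndices reports when it is called with
    the GGI_MARK_NONEXISTING_GLYPHS flag, where characters without a
    glyph come back as 0xFFFF. The `cmap` dictionary, the function names
    and the toy font are assumptions made for illustration, and the
    sketch works per code point rather than per UTF-16 code unit.

```python
MISSING_GLYPH = 0xFFFF  # marker GetGlyphIndices uses with GGI_MARK_NONEXISTING_GLYPHS


def glyph_indices(text, cmap):
    """Return one glyph index per character, MISSING_GLYPH when unmapped.

    `cmap` is a hypothetical dict {code point: glyph index} standing in
    for the character map of the selected font.
    """
    return [cmap.get(ord(ch), MISSING_GLYPH) for ch in text]


def can_render(text, cmap):
    """True when every character of `text` has a glyph in `cmap`."""
    return MISSING_GLYPH not in glyph_indices(text, cmap)


# Toy font covering only 'a', 'b' and 'c' (illustrative data).
toy_cmap = {ord("a"): 1, ord("b"): 2, ord("c"): 3}
```

    Calling `can_render(text, toy_cmap)` just before drawing tells the
    application whether it has to look for another font.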

    If it cannot, one can use other Uniscribe functions to perform
    additional mappings, and if that fails, one can add another TrueType
    font to a logical font, choosing among those whose descriptors have
    the relevant script bit set. The application may let users choose
    a preferred order for all fonts that have that script bit set.
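
    The selection by script bit could look like the following Python
    sketch. The bit positions, the `fonts` dictionary of bit masks and
    the function name are all hypothetical; a real application would
    read the bits from each font's descriptor instead.

```python
# Hypothetical bit positions, chosen for illustration only.
SCRIPT_BIT = {"Latin": 0, "Greek": 1, "Cyrillic": 2}


def fonts_for_script(script, fonts, user_order=()):
    """Names of fonts whose script mask has the bit for `script` set.

    `fonts` is a hypothetical dict {font name: integer bit mask}.
    Fonts named in `user_order` come first, in that order; the others
    follow alphabetically.
    """
    bit = 1 << SCRIPT_BIT[script]
    candidates = [name for name, mask in fonts.items() if mask & bit]
    order = list(user_order)

    def key(name):
        position = order.index(name) if name in order else len(order)
        return (position, name)

    return sorted(candidates, key=key)


installed = {
    "FontA": 0b011,  # Latin + Greek
    "FontB": 0b100,  # Cyrillic only
    "FontC": 0b111,  # all three scripts
}
```

    With this data, `fonts_for_script("Greek", installed, ["FontC"])`
    puts the user's choice FontC ahead of FontA and leaves FontB out.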

    Then the application will create a logical font for that script using
    this preference order. But if no font in the collection contains
    the glyph, there will be no other choice than displaying the
    substitution glyph of the first font (such as a rectangular bullet),
    normally bound to U+FFFD unless the font descriptor specifies
    a specific glyph.
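
    A rough Python sketch of that lookup, assuming the logical font is
    just an ordered list of (name, coverage) pairs. The data layout and
    the function names are hypothetical, not an existing API.

```python
def pick_font(code_point, ordered_fonts):
    """Return (font name, has_glyph) for one code point.

    `ordered_fonts` is a hypothetical non-empty list of
    (name, set of supported code points) in preference order.
    """
    for name, coverage in ordered_fonts:
        if code_point in coverage:
            return name, True
    # No font covers it: fall back to the first font, which will draw
    # its substitution glyph.
    return ordered_fonts[0][0], False


def assign_fonts(text, ordered_fonts):
    """Map each character of `text` to (char, font name, has_glyph)."""
    return [(ch, *pick_font(ord(ch), ordered_fonts)) for ch in text]


logical_font = [
    ("FontA", {ord("a"), ord("b")}),
    ("FontB", {ord("b"), ord("c")}),
]
```

    Here 'b' is taken from FontA because it comes first in the order,
    and a character covered by neither font is assigned to FontA with
    `has_glyph` set to False.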

    Another strategy is for the application to create one logical font
    per language, if the text to render is labelled (out-of-band) with a
    language indicator. This gives more coherent results than creating
    a logical font per supported script, notably for Latin-based
    languages that use many characters, such as Vietnamese...

    So if a markup language specifies a font family, the font stack
    will have this family at the top of the stack, followed by the fonts
    for the language+script combination, then by the fonts
    for that particular script, then by all preferred fonts
    for any script, and finally by all other fonts.
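
    The ordering described above can be sketched in Python as a sort
    over five ranks. The font records (dicts with "name", "scripts" and
    "languages" keys), the `preferred` list and the function name are
    assumptions for illustration only.

```python
def build_font_stack(family, language, script, fonts, preferred):
    """Order font names: requested family, language+script fonts,
    script fonts, preferred fonts for any script, then all others.

    `fonts` is a hypothetical list of dicts with keys "name", "scripts"
    and "languages"; `preferred` is the user's ordered list of names.
    """
    def rank(font):
        if font["name"] == family:
            return 0
        if script in font["scripts"] and language in font["languages"]:
            return 1
        if script in font["scripts"]:
            return 2
        if font["name"] in preferred:
            return 3
        return 4

    def user_order(font):
        name = font["name"]
        return preferred.index(name) if name in preferred else len(preferred)

    # sorted() is stable, so fonts with equal keys keep their input order.
    ordered = sorted(fonts, key=lambda f: (rank(f), user_order(f)))
    return [f["name"] for f in ordered]


installed_fonts = [
    {"name": "Misc", "scripts": {"Cyrl"}, "languages": set()},
    {"name": "LatinGeneric", "scripts": {"Latn"}, "languages": set()},
    {"name": "Favorite", "scripts": {"Grek"}, "languages": set()},
    {"name": "VietFont", "scripts": {"Latn"}, "languages": {"vi"}},
    {"name": "Requested", "scripts": {"Latn"}, "languages": set()},
]
```

    For Vietnamese text in Latin script with "Requested" as the markup
    family and "Favorite" as the only preferred font, the stack comes
    out as Requested, VietFont, LatinGeneric, Favorite, Misc.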

    -- Philippe.


