RE: Question about Unicode Ranges in TrueType fonts

From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Jun 26 2003 - 15:26:07 EDT

    Elisha Berns asked:

    > It would appear from your answer that even after implementing the
    > algorithm to search the Unicode block coverage of a font, the actual
    > comparison "data", that is which blocks to compare and how many code
    > points, is totally undefined. Is there any kind of standard for
    > defining what codepoints are required to write a given language? This
    > seems like the issue that fontconfig gets around by using all those
    > .orth files which define the codepoints for a given language. But is
    > there any standardized set of language required codepoint definitions
    > that could be used?

    Not a standard that I know of, but there are a number of compilations
    of what *characters* are required for the alphabets of various
    languages. See, for example:

    http://www.evertype.com/alphabets/index.html

    for European languages. From each list of characters it is fairly
    straightforward to derive which Unicode-encoded characters would
    be required to support that list.
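
    A minimal sketch of that derivation, in Python. The function name
    and the sample letters are illustrative assumptions, not taken from
    any of the compilations mentioned here:

```python
import unicodedata


def required_codepoints(alphabet: str) -> set[int]:
    """Return the code points needed to display every character in `alphabet`.

    The text is normalized to NFC first, so a base letter followed by a
    combining mark is counted as the precomposed character where one exists.
    Whitespace is skipped.
    """
    composed = unicodedata.normalize("NFC", alphabet)
    return {ord(ch) for ch in composed if not ch.isspace()}


# Illustrative sample only: a few accented letters in both cases.
# A real list would be transcribed from an alphabet compilation.
sample_letters = "õäöüšž"
needed = required_codepoints(sample_letters + sample_letters.upper())
print(sorted(f"U+{cp:04X}" for cp in needed))
```

    Normalizing to NFC is a design choice in this sketch; a font that
    relies on combining marks would need the decomposed form instead.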

    http://www.eki.ee/itstandard/ladina/

    is another source. This goes a little further afield into languages
    using Cyrillic characters, and also provides information about
    Unicode encodings directly.

    Note that for any such listing, you still need to take into account
    what punctuation or other characters might also be needed for
    the language's conventional orthography/ies, since the typical
    listing you will find is only for the alphabetic characters used
    by the language.
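
    As a rough sketch of that adjustment (Python), the alphabetic list
    can be merged with a punctuation list before comparing against what
    a font covers. The punctuation string below is a hypothetical
    placeholder, and the font coverage is assumed to be already
    available as a set of code points:

```python
# Hypothetical punctuation list: real conventions differ per language
# and orthography, so this string is an assumption for illustration.
EXTRA_PUNCTUATION = ".,;:!?-()\u201C\u201D\u2013"


def missing_codepoints(letters: str, punctuation: str,
                       font_coverage: set[int]) -> set[int]:
    """Return the code points the language needs but the font lacks.

    `font_coverage` is assumed to be the set of code points the font
    maps to glyphs; how it is obtained is outside this sketch.
    """
    needed = {ord(ch) for ch in letters + punctuation}
    return needed - font_coverage


def supports_language(letters: str, punctuation: str,
                      font_coverage: set[int]) -> bool:
    """True when nothing required is missing from the font."""
    return not missing_codepoints(letters, punctuation, font_coverage)


# Example: a font covering only printable ASCII lacks the curly quotes
# and the en dash from the punctuation list above.
ascii_only = set(range(0x20, 0x7F))
gaps = missing_codepoints("abc", EXTRA_PUNCTUATION, ascii_only)
print(sorted(f"U+{cp:04X}" for cp in gaps))
```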

    >
    > Anyways, where is the up-to-date list of Unicode blocks to be found?

    http://www.unicode.org/Public/UNIDATA/Blocks.txt
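
    The data lines in that file have the form "start..end; name", with
    "#" introducing comments. A minimal sketch (Python) of parsing that
    layout and counting a font's coverage per block follows. The sample
    text is an abridged stand-in for the real file, and the function
    names are illustrative:

```python
def parse_blocks(text: str) -> list[tuple[int, int, str]]:
    """Parse Blocks.txt-style text into (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        range_part, name = line.split(";", 1)
        start, end = range_part.strip().split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    return blocks


def block_coverage(blocks: list[tuple[int, int, str]],
                   font_coverage: set[int]) -> dict[str, tuple[int, int]]:
    """Map block name -> (code points covered, block size).

    Only blocks with at least one covered code point are reported.
    Block size counts every code point in the range, including any
    that are unassigned, so full coverage is not always attainable.
    """
    result = {}
    for start, end, name in blocks:
        covered = sum(1 for cp in font_coverage if start <= cp <= end)
        if covered:
            result[name] = (covered, end - start + 1)
    return result


# Abridged sample in the same layout as the real file.
SAMPLE = """\
# Sample lines in Blocks.txt layout
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0400..04FF; Cyrillic
"""

blocks = parse_blocks(SAMPLE)
print(block_coverage(blocks, {0x41, 0x42, 0x410}))
```

    Which blocks to compare, and how many covered code points count as
    "supporting" a block, remain policy decisions that this sketch does
    not settle.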

    >
    > It's odd to think that the old way of using Charset identifiers in fonts
    > worked a lot more cleanly for finding fonts matching a language/language
    > group. I would think this kind of core issue would be addressed more
    > cleanly by the font standard.

    Which font standard?

    And this is an area where implementation strategies still seem to
    be in ferment. At some point this may settle down and be the
    subject of standardization, but premature standardization can
    also be a problem if the wrong choices get codified too soon.

    --Ken
