Re: Custom fonts (was: Tolkien wanta-be)

From: Pim Blokland (
Date: Sun Mar 16 2003 - 08:18:53 EST

    Chris Jacobs schreef:

    > Mortbats code point 0034 is CANCER
    > Arial Unicode MS code point 0034 is DIGIT FOUR
    > Arial Unicode MS code point 264B is CANCER

    No. First of all, this is the wrong example: it has nothing to do
    with private use characters. Cancer is not a private use character.
    I don't know the Mortbats font, but if it has been designed in
    accordance with the rules, it may well map codepoint U+264B to
    glyph index #34. That should not cause problems or inconsistencies
    for the display system.
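
    To make the codepoint/glyph-index distinction concrete, here is a
    minimal sketch. The table is hypothetical (I have not read it from
    any real font), and it takes the "#34" above literally as glyph
    index 34:

```python
# Hypothetical character map (cmap) of a symbol font: code point -> glyph index.
# The entries are illustrative only, not taken from a real font file.
cmap = {
    0x264B: 34,  # U+264B CANCER is drawn by glyph number 34
}

def glyph_for(code_point):
    """Return the glyph index for a code point, or 0 (the 'missing glyph') if unmapped."""
    return cmap.get(code_point, 0)

cancer = glyph_for(0x264B)      # mapped: glyph 34
digit_four = glyph_for(0x0034)  # U+0034 DIGIT FOUR is not in this font: glyph 0
```

    The glyph's position inside the font is an internal detail; text
    only ever refers to the codepoint.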
    Secondly, the problem with the PUA is that it should not, and will
    not, be subjected to regulations and guidelines. Font designers are
    always free to put anything they want in there - characters,
    transcoding hints, combining accents, what have you. That is what
    the PUA is there for!

    However, let's take a look at what you really want.
    Suppose we have two custom fonts, A and B, both with 256 (custom)
    characters, and you want to free yourself of the problems caused by
    any overlapping codepoints they may have.
    Do you want to be able to tell the system that if you output
    character U+E000, for example, it should use font A, and if you
    output character U+E100, it should use font B?
    What exactly is the use of this?
    With a system like this, text files or HTML files on the Internet
    could not use such characters reliably. What would the author put
    in the file to output, say, a Tinco? The writer of the HTML file
    doesn't know at what codepoint offset you have installed the
    Tengwar font.
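
    A small sketch of why this breaks down, assuming two readers who
    installed the same two 256-character fonts at different PUA
    offsets. The font name "FontB", the offsets and the resolve()
    helper are all made up for illustration:

```python
# Two hypothetical readers with the same fonts installed at different offsets.
reader_1 = {"Tengwar": 0xE000, "FontB": 0xE100}
reader_2 = {"FontB": 0xE000, "Tengwar": 0xE100}

def resolve(code_point, installed, size=256):
    """Return (font name, index within that font) for a code point, or None."""
    for font, base in installed.items():
        if base <= code_point < base + size:
            return font, code_point - base
    return None

# The author writes U+E000, meaning the first character of the Tengwar font.
on_reader_1 = resolve(0xE000, reader_1)  # lands in the Tengwar font
on_reader_2 = resolve(0xE000, reader_2)  # lands in FontB instead
```

    The same file shows a different character on each machine, because
    the offset is local configuration that the file cannot carry.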

    A better approach would be to find a way to agree on the *names* for
    the new characters.
    One could envision a scenario where an XML (or even HTML) file
    names the font in a <FONT...> tag. The system would read this,
    load the font and extract its name table; from that point on, the
    file can contain entries like "&Tinco;", which the system can then
    display, provided there is a character named "Tinco" in the font,
    of course!
    (Note: this may not be as straightforward as it sounds. For one
    thing, the <FONT> tag has been deprecated. And the names of
    characters in TrueType fonts are PostScript names, not HTML names,
    so that the character the font calls "periodcentered" would have
    to be addressed as "&middot;". But these are details, details...)

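    The name-based lookup could be sketched roughly as follows. This is
    only an illustration: the glyph-name table, the glyph numbers and
    the resolve_entities() helper are invented, and a real system would
    read the names from the font itself. The alias table covers the
    case where the HTML name ("middot") differs from the PostScript
    glyph name ("periodcentered"):

```python
import re

# Hypothetical glyph-name table of a custom font: glyph name -> glyph index.
glyph_names = {"Tinco": 1, "Parma": 2, "periodcentered": 3}

# Entity names that differ from the PostScript glyph names in the font.
entity_aliases = {"middot": "periodcentered"}

ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def resolve_entities(text):
    """Split text into plain strings and glyph indices for recognised entities."""
    out = []
    pos = 0
    for match in ENTITY.finditer(text):
        if match.start() > pos:
            out.append(text[pos:match.start()])
        name = entity_aliases.get(match.group(1), match.group(1))
        if name in glyph_names:
            out.append(glyph_names[name])
        else:
            out.append(match.group(0))  # unknown name: keep the entity as text
        pos = match.end()
    if pos < len(text):
        out.append(text[pos:])
    return out

result = resolve_entities("a &Tinco; b &middot; &Unknown;")
```

    An entity with no matching name in the font is left untouched here,
    so the reader at least sees what the author asked for.
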
    Pim Blokland

