Re: Custom fonts (was: Tolkien wanta-be)

From: Doug Ewell
Date: Sat Mar 15 2003 - 18:09:53 EST

    Chris Jacobs <c dot t dot m dot jacobs at hccnet dot nl> wrote:

    > I think it is more important that there be a mechanism to relocate a
    > font when you install it.
    > Font developers should be free to develop fonts without having to
    > decide where in the end-user's pua it is supposed to be.
    > My pua, the pua of font developer A, and the pua of font developer B
    > should be considered three different spaces, so that there be no
    > conflict when developer A and B put different chars on the same code
    > point.
    > And if it works that way I see no end-user problems if a font
    > developer has his pua on a place where it not belongs.

    The analogy to relocatable program code is intriguing. But as Pim
    responded, the whole purpose of data interchange would be defeated if
    end users could do this.

    Suppose James Ruddy goes ahead and writes his novel in his invented
    language, using an invented script I'll call "Ruddian." Suppose he uses
    Private Character Editor to create his glyphs, and (for whatever reason)
    picks a range beginning at U+E770. Say, for instance, that RUDDIAN
    LETTER KA ends up being assigned to U+E773.

    If James creates or commissions a font that includes Ruddian characters
    in this range, he can make it available to end users so they can read
    his novel on the Web.

    But when James wants to represent RUDDIAN LETTER KA on his Web page,
    he's going to use U+E773. Any other code point will not be KA; it will
    be something else (possibly a .notdef glyph). If the end user has
    invoked some sort of relocation command to move KA from U+E773 to
    somewhere else, he won't be able to read James's text. Only U+E773 will
    do. That's why relocatable PUAs won't work; they destroy data
    interchange.
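
    The failure can be sketched in a few lines of Python. This is only an
    illustration: the font tables, the glyph name, and the relocation
    target U+E900 are made up, and a real font's character map is more
    involved than a dictionary.

```python
# Hypothetical data: neither the font table nor U+E900 comes from a real font.
NOTDEF = ".notdef"

# James's font as shipped: code point -> glyph name.
shipped_font = {0xE773: "ruddian_ka"}


def relocate(font, offset):
    """Return a copy of the font with every code point shifted by offset."""
    return {cp + offset: glyph for cp, glyph in font.items()}


def render(text, font):
    """Look up each character; unmapped code points fall back to .notdef."""
    return [font.get(ord(ch), NOTDEF) for ch in text]


# What James actually put on his Web page: RUDDIAN LETTER KA at U+E773.
page_text = "\uE773"

# The end user "relocates" the font so that KA now lives at U+E900.
relocated_font = relocate(shipped_font, 0xE900 - 0xE773)

print(render(page_text, shipped_font))    # the KA glyph is found
print(render(page_text, relocated_font))  # the lookup falls back to .notdef
```

    The text on the page never changes; only the reader's table does, so
    the lookup for U+E773 no longer finds KA.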

    Now, as Chris points out, "developer A" (in this case, the ConScript
    Unicode Registry) and "developer B" (William Overington) have already
    assigned other characters to U+E773, in their own PUA implementations.
    There's already a conflict between A and B, and James (as developer C)
    would only be adding to the apparent chaos. And we don't know how many
    other individuals and companies might have put something else at U+E773
    in various fonts and registries.

    That's why it's the PRIVATE use area -- each user gets to choose how the
    code space is divided up for his own personal use. (It doesn't mean the
    assignments have to be kept secret!) The CSUR assignments can be posted
    on the Web, as can William's "Golden Ligature Collection," and other
    developers and end users are free to implement those assignments. Or
    they can ignore them and create brand-new assignments for the same code
    space, as James did in this example. Either approach is totally
    conformant.

    Here's an instance of U+E773: []
    What do you see?

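    (a) a Solresol syllable (the ConScript assignment)
    (b) a Golden Ligature (William's assignment)
    (c) RUDDIAN LETTER KA
    (d) a black box, geta mark, or other .notdef glyph
    (e) anything else imaginable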

    Right now, if you see (a), (b), (d), or even (e) you are totally
    conformant, and both my sending system and your receiving system are
    working correctly. If James makes a Ruddian font available as described
    above, then (c) would be OK as well. If your interpretation of U+E773
    doesn't match mine, oh well. That's life in the Wild Wild PUA.

    So what's the solution? How do we prevent this sort of chaos from
    destroying data interchange?

    The answer is that any text interchange involving the PUA must depend on
    an *agreement* between sender and receiver as to which PUA convention is
    being used. In other words, if I send U+E773, you should have some way
    of knowing whether I meant it to be a Solresol syllable, a Golden
    Ligature, or something else. There might be an implicit understanding
    that all PUA characters on a Web site are encoded according to (e.g.)
    ConScript, or it might be explicitly stated at the start of a file or
    page.
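
    One way to picture such an agreement is a lookup keyed by convention
    as well as by code point. The sketch below is in Python; the
    convention names and the labels in the table are placeholders for
    illustration, not the actual entries of any registry or font.

```python
# Hypothetical registry: the convention names and labels are placeholders,
# not the real assignments of ConScript, the Golden Ligatures, or any font.
CONVENTIONS = {
    "conscript": {0xE773: "a Solresol syllable"},
    "golden-ligatures": {0xE773: "a Golden Ligature"},
    "ruddian": {0xE773: "RUDDIAN LETTER KA"},
}


def interpret(ch, convention=None):
    """Describe a PUA character under the convention the sender named."""
    cp = ord(ch)
    if convention is None:
        # Without an agreement, the code point alone carries no meaning.
        return f"U+{cp:04X}: private use, meaning unknown without an agreement"
    table = CONVENTIONS.get(convention, {})
    name = table.get(cp, "not assigned in this convention")
    return f"U+{cp:04X}: {name} (per {convention})"


print(interpret("\uE773"))
print(interpret("\uE773", "conscript"))
print(interpret("\uE773", "ruddian"))
```

    The same code point yields three different answers, and none of them
    is wrong; the convention is what settles which one the sender meant.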

    The sample texts on my Web site written in my invented script (see, for
    example) are not exemplary, since they don't say explicitly that the
    ConScript encoding is used. I should probably include at least an HTML
    comment to that effect.
    BTW, another problem with "relocatability" is that it would force end
    users, who generally don't know squat about the PUA, to make a technical
    decision regarding it.

    -Doug Ewell
     Fullerton, California

    This archive was generated by hypermail 2.1.5 : Sat Mar 15 2003 - 18:58:28 EST