Fonts, glyphs and infinite Unicode (was String name etc).

From: Arcane Jill
Date: Mon Apr 25 2005 - 03:32:12 CST

    -----Original Message-----
    From: []On
    Behalf Of Hans Aberg
    Sent: 23 April 2005 19:33
    To: Doug Ewell
    Cc: Unicode Mailing List
    Subject: Re: String name and Character Name

    > We are essentially back at a discussion held here some time ago: The
    > limit on the number of Unicode code points is due to a design flaw in
    > the UTF-16 encoding, where the engineers who did it failed to properly
    > separate the notions of character numbers and integer-to-binary
    > encoding. If one makes that separation, it is easy to extend the
    > ranges even to infinity, if one so likes. This can easily be done
    > with UTF-8/32, and also with some effort with UTF-16. One can then
    > have sufficiently many private code points for every citizen of the
    > world to register as many characters as they can by hand. Note though
    > that a requirement of supplying a glyph in a public rendering format
    > would probably diminish the number of submissions. One can also have
    > other restrictions, to exclude submissions which in some ways are not
    > considered serious. But the examples on the pages you indicate would
    > probably qualify for inclusion.
    > --
    > Hans Aberg

    I am aware that there is a problem, but I don't believe it's the one that Hans
    has identified. I'd say it is best exemplified by the following question:
    suppose I wanted to publish a web page about music, using the Unicode musical
    characters. How would I do that, exactly?

    The reality is, I simply wouldn't. I'd give up. I'd use IMAGES for my musical
    notes, not characters. And so would any serious web designer.

    Why? Because /there is no way to guarantee that viewers of my web site will
    possess the required font!/ And in fact, in this particular case, it is HIGHLY
    LIKELY that they won't.

    So ... if I can't even use all Unicode characters safely, the chance of my
    getting away with PUA characters in a web page is basically zilch.

    Hans's solution (to allow a potentially infinite number of glyphs) might solve
    this, because, if some web service existed which could translate a number into
    a glyph, that mechanism would presumably work just as well for Unicode glyphs
    as for PUA glyphs. But it is using a sledgehammer to crack a nut, and in my
    opinion goes way, way further than is needed.

    So here's what I would suggest as an alternative. This is outside the scope of
    the Unicode Consortium, and I suspect the relevant body is probably the W3C.
    Anyway, I reckon a small addition to CSS could solve the problem, something
    like:

    : font-url = "http://url.of.relevant.font.ttf"

    which would allow you to specify, for example, something like:

    : p { font-url = "http://url.of.musical.font.ttf" }

    to allow web pages containing musical characters to display correctly, or

    : p { font-url = "http://url.of.CSUR.font.ttf" }

    to allow web pages containing Klingon characters to display correctly, and so
    on. I'm sure those clever folk at W3C can figure out a way of having multiple
    URLs specified to allow a choice of fonts for different codepoint ranges.
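
    As a rough illustration of the "different codepoint ranges" idea, here is a
    minimal Python sketch of how a browser might pick a font URL for each
    character. The table and both function names are hypothetical, invented for
    this example; the URLs are the placeholders used above; the ranges assume
    the Unicode Musical Symbols block (U+1D100..U+1D1FF) and the CSUR Klingon
    assignment in the PUA (U+F8D0..U+F8FF). It is not a description of any real
    browser or CSS behaviour.

```python
from typing import List, Optional

# Hypothetical table: (first code point, last code point, font URL).
# URLs are the placeholders from the message; ranges are assumptions
# (Unicode Musical Symbols block and the CSUR Klingon PUA assignment).
FONT_RANGES = [
    (0x1D100, 0x1D1FF, "http://url.of.musical.font.ttf"),
    (0xF8D0, 0xF8FF, "http://url.of.CSUR.font.ttf"),
]


def font_url_for(char: str) -> Optional[str]:
    """Return the URL of the first listed font whose range covers char."""
    code_point = ord(char)
    for first, last, url in FONT_RANGES:
        if first <= code_point <= last:
            return url
    return None  # no listed font: fall back to locally installed fonts


def fonts_needed(text: str) -> List[str]:
    """Return the distinct font URLs the text needs, in order of first use."""
    urls: List[str] = []
    for char in text:
        url = font_url_for(char)
        if url is not None and url not in urls:
            urls.append(url)
    return urls
```

    With a table like this, a page would only trigger a download for the fonts
    its text actually uses; plain Latin text would fetch nothing at all.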

    So, to summarise - I think there /is/ a problem, but I think that the solution
    lies with an extension to CSS, and corresponding changes to web browsers to
    implement it.


    PS. Thanks for the idea, Hans.
