Re: more dingbats in plain text

From: Asmus Freytag
Date: Fri Apr 17 2009 - 22:26:38 CDT

    On 4/17/2009 7:59 PM, Doug Ewell wrote:
    > Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
    >> If that kind of thing amuses you, try reading the introduction to the
    >> Unicode Standard. The early versions boldly proclaim many things off
    >> limits that later happened. From 32-bit character codes to Musical
    >> symbols.
    > That's true; ...[t]he world has changed quite a bit in 21 years. I
    > mentioned that the Principles and Procedures document was updated less
    > than a year ago. This is not some relic of Dr. Becker's original
    > vision that has proved impractical in the modern world; it was
    > reissued in May 2008. After 20 years of Unicode, it seems unlikely
    > that there was some gross lack of foresight in May 2008 concerning the
    > type of symbols that should or should not be encoded, which suddenly
    > came clear in December 2008.
    You'd be surprised.
    (The document gets re-issued constantly, by the way. Be aware that it
    addresses many topics, not only symbols, and there are many updates that
    affect only one or two specific items in the text.)
    >>> Unless they can be defined as "compatibility characters," in which
    >>> case all of them must be encoded without question.
    >> The "sets of symbols" I was addressing in that part of my message,
    >> however, did not include compatibility character sets, but sets
    >> organized by category or type of symbol, like ISO safety symbols, UI
    >> symbols, etc.
    > If the set of symbols is captured by glyphs in a font, though, that
    > might qualify as a compatibility character set.
    Generally not. There are too many of them. You need some other
    considerations. For example, a nearly universally available and
    accessible set, consistently mapped to a specific font. That's getting
    > It's been pointed out that the Zapf Dingbats got in by virtue of being
    > encoded in the repertoire of contemporary laser printers. That wasn't
    > a "character set" in the sense of ISO 8859 or Big5 or Shift-JIS.
    The latter were "file system" character sets, and later UI character
    sets - i.e. directly supported by an OS.

    There were other character sets that were supported *as character sets*
    by devices and early applications. The early coding community included
    people from companies that were building such applications or devices.
    > The Wingdings and Webdings family of fonts, distributed with every
    > copy of Windows for over a decade, absolutely qualify as
    > "compatibility character sets" according to the guidelines being
    > applied for the emoji.
    That is your position. I think it is not without merit, by the way, but
    the decision whether to accept your reasoning rests with the UTC and
    WG2. That won't (or can't) happen until you submit a proposal to encode
    these characters which provides your rationale for making that choice.


    PS: By the way, a good portion of these sets (W***dings) is already
    covered, and of the remaining characters, a good number would also
    qualify individually under the criteria I proposed for _common symbols_.
    The proportion of pure compatibility characters is therefore much less
    than the whole.
