Re: Emoji: emoticons vs. literacy

From: James Kass
Date: Wed Jan 07 2009 - 10:33:47 CST


    You cover many important points. It's true that pragmatic
    interoperability was a cornerstone Unicode design goal. Unicode
    achieved that goal by offering the two sources you mention,
    government and industry, the incentive to become interoperable
    via 1:1 encodings. That was pragmatic, indeed. These are the
    legacy encodings.

    Many of the characters in the legacy encodings were no problem
    at all, but others might have been thought a bit funny by some.
    Didn't matter, in they went. Legacy. Pragmatic interoperability.
    Very practical and effective.

    Some of those funny legacy characters wouldn't make it past the
    door, if they tried crashing our party today, we're told. They
    wouldn't meet our principles. Legacy -- backwards compatibility
    with other official (government, national or international) and
    quasi-official (industry, also known as the manufacturers/vendors
    of computer devices) pre-existing standards.

    Unicode successfully initiated a fairly consistent architecture
    guaranteeing interoperability among those mutually overlapping
    legacy character sets. Unicode has even gone beyond those initial
    goals. Characters have been added enabling diverse user communities
    to represent their own scripts, own languages, and own identity
    in computer plain-text. Unicode has gone beyond mere
    interoperability between old-fashioned, legacy character sets --
    Unicode is now *the* standard for computer plain-text. People
    exchange Unicode data all the time, without any need to convert
    its encoding or even having to worry about encoding at all.
    Interoperability means it just plain works.

    I tend to agree with the first parts of Mark's letter. The reason
    all those things sound so grand is that they *are* grand.

    And then there's those last two lines. There don't appear to
    be pragmatic interoperability issues driving Unicode's push
    to encode these emoji. Instead, the vendors (not the same
    group of computer manufacturers busily drafting all of
    those legacy character sets we had to encode) have already
    solved their internal problems with respect to interoperability.
    They've got a plethora of PUA in which to either expand their
    individual icon sets mutually exclusively, or they can get their
    respective acts together and work things out in the PUA
    consistently. Unicode plain-text fulfills its goal of interoperability,
    in the unlikely or unwelcome event that private messages between
    cell phone users in Japan are getting sucked into somewhere they
    shouldn't be, by guaranteeing that those private use Unicode
    characters don't get munged somewhere along the way. Those PUA
    characters can be studied, stored, sorted, sent, indexed, and even
    displayed consistently, as long as someone takes the trouble to do
    so. If they go somewhere they shouldn't be going, and some app.
    wants to do something with them, but can't process anything it
    can't interpret, that should be just fine. If the programmer wants
    to get information about specific PUA use and is unable to do so,
    that's the way it goes. Probably wasn't anybody's business, anyway.
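
    As a rough illustration of the point about private use characters
    surviving transit, here is a minimal Python sketch. It assumes
    nothing about any vendor's actual assignments: U+E63E is just an
    arbitrary private-use code point picked for the example, and the
    helper names are made up. It only shows that such a character
    round-trips through UTF-8 unchanged and can be picked out by its
    general category (Co) without knowing what it means.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def private_use_code_points(text: str) -> list[str]:
    """List the private-use code points in text, in order, as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]


# U+E63E is an arbitrary BMP private-use code point chosen for illustration;
# no particular vendor assignment is implied.
message = "see you at 7 \ue63e"

# Encoding to UTF-8 and decoding again leaves the private-use character intact.
round_tripped = message.encode("utf-8").decode("utf-8")

print(round_tripped == message)
print(private_use_code_points(round_tripped))
```

    An application that receives such text can store, compare, and
    index it as is; interpreting the character still requires whatever
    private agreement the sender was working under.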

    The information about cell phone vendors' PUA use is available to
    search engine companies; they can use it as they please. Maybe
    as the cell phone vendors add yet more cartoons and pictograms
    to their sets, and other cell phone companies along with anyone
    else using any kind of icon contribute *their* sets -- they'll be
    kind enough to keep their internal PUA charts updated and available
    for anyone who might want them.

    Then there's the phrase about "having the emoji symbols encoded"
    in computer plain-text "will be far more useful to many more
    people than" Phaistos. Putting Phaistos aside for a moment, who
    are these far, far more people who will be deriving benefit? Those
    millions of Japanese cell phone users? No, they've already got
    interoperability between themselves, and can expand, contract,
    redraw, reassign and redistribute anything in the PUA they want.

    Who is it, then, that benefits? Is it the potential future customers
    and existing customer base of other cell phone vendors world-wide?
    No, they'll surely end up just adding their stuff to the PUA, too.
    That way, *they* control it. As it should be.

    What are these benefits, who is going to get them, and how much
    serious attention is given to alternatives?

    Thank you for not mentioning "compatibility" characters in your

    What else. Ah, Phaistos. My opinion, for what it's worth, is that
    it's OK to leave Phaistos in.

    Best regards,

    James Kass

    -----Original Message-----
    >From: Mark Davis
    >Sent: Jan 6, 2009 5:23 PM
    >To: Asmus Freytag
    >Cc: James Kass, Peter Constable
    >Subject: Re: Emoji: emoticons vs. literacy
    >
    >I'll second Asmus.
    >
    >We cannot forget that when we set out to design Unicode, pragmatic
    >interoperability was a key goal. We never had a "pure" standard, divorced
    >from such concerns.
    >
    >(I hear various comments about "industrial Goliaths" on these threads -- but
    >without the support and active involvement of industry and governmental
    >organizations, Unicode would have been an interesting academic exercise, but
    >no more than an academic exercise. And if anyone on this list is interested
    >in academic exercises, they are free to start their own.)
    >
    >Instead, our goal was to produce a standard that would allow us to have as
    >consistent an architecture as possible -- to enable effective and efficient
    >implementations -- *and* to interoperate well with a host of standards and
    >practices in widespread use: national, international, and vendor. The
    >Unicode Standard has added misc symbols many times before: Dingbats, Misc
    >technical symbols, ARIB symbols, etc., just for that reason.
    >
    >Of course the scope of Unicode changed over time. Initially, for example, we
    >were not really aiming at encoding archaic scripts. I think at one time we
    >had excluded encoding Braille as well. But it continues to be driven by
    >pragmatic concerns of interoperability. And having the emoji symbols encoded
    >will be far more useful to many more people than, say, the Phaistos disk
