Re: Emoji: emoticons vs. literacy

From: Asmus Freytag (
Date: Fri Dec 26 2008 - 21:34:54 CST

On 12/26/2008 5:55 AM, Doug Ewell wrote:
> Christopher Fynn <cfynn at gmx dot net> wrote:
>> If carriers start using Unicode instead of Shift JIS there are all
>> kinds of currently "unused" characters available for them to abuse ~
>> or they could come up with several different PUA encodings - and then
>> later come up with a proposal to standardise these using non-PUA
>> characters with the same argument of "interoperability" found in this
>> proposal.
> Isn't that exactly what happened with the current unified emoji
> repertoire? The three vendors encoded their (different) sets of
> pictures in different ranges of the Shift-JIS user-defined area, then
> looked to Unicode to unify the three sets in a common range.
One of the most important principles on which the Unicode effort was
founded was to provide a unified encoding, to finally have one single
representation for a character or symbol, instead of multiple competing
character sets, all with different codes for the same item. In providing
a unified encoding for items that exist (and are widely used) in
fragmented character sets, Unicode is fulfilling one of its core
missions. By not arbitrarily denying the needs of its users to have a
unified representation of this new phenomenon, emoji, Unicode is
redeeming a key promise to its implementers.
> I think it should be clear that there is a significant body of
> resistance to encoding these images in Unicode, although Asmus and
> Mark and Ken (among others) are on board with them and that is
> probably all it will take to get them encoded. They are a major
> compromise to the basic principles that have guided Unicode since its
> inception, in terms of what does and does not belong in a character
> encoding standard. They establish a new principle, that a group of
> 800-pound corporate gorillas can override the precedent of 15+ years
> in determining what gets encoded.
Well, I'm neither a gorilla, nor that much overweight, but thanks for
your confidence that my opinion still matters ;-)

I disagree, fundamentally, with the charge you are trying to lay at the
UTC's doorstep. I think that is not deserved. It may be based on a
misunderstanding of the fundamental nature of the Unicode project and
the revolution in character encoding it initiated.

By aiming for a universal character set, Unicode is exposed to different
constraints than designers of special-purpose character sets. In
essence, a universal character set, in order to be universal, has to
model the world (of character code usage). Unfortunately, that includes
not only the high points of writing systems for modern and classical
civilization, but also the warts. Because of that, it's impossible to be
simultaneously in full control of what's considered an encodable
character and achieve universal coverage.

If users persist in treating as characters something that you think should
not be a character, you have only two choices: extend your definition of
character, or stop being universal.
> And I really don't want to hear again that the arguments against
> encoding emoji are emotional and hysterical and opinionated, while the
> arguments in favor of emoji are based on sound, logical reasoning.
> There are facts and opinions on both sides.
I fully expect sound, logical reasoning to understand and integrate the
constraints I've outlined here. Anything that doesn't is indeed
opinionated, and perhaps even hysterical.


PS: Before I'm misunderstood - in terms of proposed characters, there is
occasionally a third choice. A proposed character may be already
encoded, or it may be possible to represent it with character sequences.
Sometimes, it may be something that can't be a character and must be
handled elsewhere in the architecture. None of these apply in the
current discussion - the emoji are eminently supportable as characters,
as is daily proven by millions of working implementations, and they are
not duplicates or variants that can be unified with other characters.
That's why I limited my argument above to the two fundamental choices.

This archive was generated by hypermail 2.1.5 : Fri Jan 02 2009 - 15:33:07 CST