Re: Emoji: Public Review December 2008

From: Doug Ewell
Date: Sat Dec 20 2008 - 23:19:50 CST

Asmus Freytag <asmusf at ix dot netcom dot com> wrote:

> I'm surprised at how much of this discussion appears to be driven by
> prior conviction and how many of the arguments that are being made
> seem to become emotional. Many contributors seem to base their input
> purely on a value judgment of what they deem appropriate types of
> text.

My input has been based on:

(a) where the WG2 "Principles and Procedures" document (N3452) says the
line should be drawn,

(b) where the Unicode Consortium and WG2 have drawn the line for the
past 15 years, and

(c) what the most respected authorities within UTC, including Asmus,
have said for the past 10-plus years about where the line should be

> Architecturally, Unicode is designed to address plain text. Over time,
> the shared understanding of what is plain text has evolved - starting
> initially from the type of plain text seen in plain text environments
> such as old-style e-mail, for example, and later being expanded to
> encompass codes for the underlying text entities in markup languages,
> even if they aren't fully usable outside of such protocols. The sets
> of symbols for musical and mathematical notation contain quite a
> number of characters that are only fully functional when used with a
> full music composition system or mathematical layout (such as MathML).

BLOOD TYPE A and ROASTED SWEET POTATO and POOP fit into the modern
shared understanding of what is plain text?

I grant that there are many symbols in this collection that are
communicative in nature, and would fit comfortably within Unicode.
There are many others that are not, and would not. The decision to
encode the entire set, the inappropriate ones as well as the
appropriate, appears from here to have been based on the clout and
prestige of the requesters, not the appropriateness of the symbols.

> That emoji act functionally like plain text elements the way that they
> fit into the architecture of numerous existing implementations and
> that they are interchanged - about these facts there can be no
> reasonable disagreement.

Japanese cell-phone vendors are using these symbols as plain text
characters. About this fact, there can be no reasonable disagreement.
As to whether a symbol like ROASTED SWEET POTATO carries any
communicative value, beyond being a picture of a roasted sweet potato,
there can be plenty of disagreement.

N3452 specifically mentions "pictures of cows" and "stop sign" as
examples of symbols that should not be encoded. Naturally it is a bit
of a surprise to see so much official and expert support behind the
encoding of COW and TRAFFIC LIGHT.

> Pretending otherwise does not speak from the observable facts, but
> rather appears based on prior convictions and value judgments of a
> sort, which, I believe, have no place in the development of the Unicode
> Standard.

This statement is highly subjective. I'm reading N3452. If that
represents prior convictions and value judgments, then they are not
mine, but those of WG2 -- which is admittedly not UTC. (I suppose it
would be interesting to know what the official WG2 position is on emoji,
and how WG2 reconciles support for emoji with support for N3452.)

> Suggestions like endorsing permanent private use code assignments or
> inventing special, stateful, mini-markup for these characters, are
> likewise driven by the desire to express a value judgment, and not by
> careful analysis of the technical requirements. Some of these
> suggestions were made by people whose sound technical judgments I had
> come to trust. I will have to be more careful in the future: these
> suggestions, if acted on, would do more harm to the Unicode Standard
> than admitting even an unexamined set of symbol characters.

Well, thank goodness I never suggested any of these.

> What is needed most, at this juncture, is not further opinionizing
> about the value of these proposed characters, but the detailed work of
> sorting them into the standard. There are enough hard questions to be
> answered:

So, in other words, the decision to encode the entire set has been made,
and resistance is futile.

> 1) Are there entities that can't be encoded for exceptional reasons?
> 2) What are the semantic distinctions and range of semantics to be
> encoded?
> 3) What to do about semantic distinctions normally represented by text
> styles?
> 4) What to do about naming?
> ...
> To 3: Mathematical symbols assign semantic value to what would be
> stylistic variation in other context. In principle, the same could be
> applied to color distinctions for emoji. Some emoji codes would
> require color for correct rendering, but color would otherwise remain
> limited to markup. Such a solution would be entirely parallel to what
> was done for mathematical alphabets -- however -- if there's a
> possible mapping to a range of textures (black, white, lined, hatched)
> that would be an acceptable way of handling the situation, so as to be
> able to sidestep this issue entirely for now, and perhaps for ever.
> ...
> Having said all this, why can't I find more of a discussion of
> individual characters from the proposal, e.g. in the light of the four
> questions I outlined above?

Fine, here's a question related to item #3, and the individual
characters to which item #3 relates:

If the proposal is being made to establish some sort of cross-hatching
scheme to represent colored images, similar to that used for heraldic
tinctures, then what sort of scheme shall be used for animated images?
Several of the proposed images, especially those present only in the
KDDI and SoftBank collections, are attested only as animations. How,
for example, are we supposed to distinguish between CHICK and HATCHING
CHICK unless our fonts and rendering engines (or printed pages) support
animation?

Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
