Re: Emoji: emoticons vs. literacy

From: Asmus Freytag
Date: Tue Dec 30 2008 - 19:04:59 CST

On 12/30/2008 4:04 PM, André Szabolcs Szelp wrote:
>>> If you mean that it must be possible to indicate the meaning of something as an emoticon symbol, then I think we are back to the question whether such symbols, as independent characters and not a play on characters, are used.
>>> Shouldn't this be quite independent of the question whether they have ASCII "fallbacks" or imitations or origins? Yet we are stuck with the confused issue of "emoticons" that are ASCII strings on one side and "independent" character on the other.
>> No, in the Unicode context, if you interpret ":)" to mean the character "smiley", then you are no longer interpreting it as two ASCII characters. For HTML to interpret &gt; as ">" is fine, because it's a clearly defined protocol, with announcement mechanisms. For general text, the use of ":)", or worse "8)" presents an ambiguity, precisely because of the absence of clear protocol definition or announcement mechanisms.
> Well, your argumentation is flawed here. Actually, interpreting :)
> different than "colon-parenthesis" (e.g. as "smiley") does not imply
> that you are not interpreting the text as ASCII. On the contrary. When
> you interpret the substring "sh" in an English text (encoded in ASCII)
> as IPA [ʃ] rather than IPA [s.h], you are still interpreting the
> transmitted word "sunshine" in ASCII.
Not at all.

If I write ":)" and you see ":)" then both you and I interpret this as a
string of two ASCII codes. We are free to assign whatever meaning to
this we desire, from "meet me at a secret location" to "hey, I didn't
mean that too seriously" to whatever.

If I select a circle with two dots and a curved line from a list of
symbols in a chat program, and if you, on getting my message, see a
circle with two dots and a curved line, then we interpret what we sent
as the emoticon for "smiley face", and we are free to assign whatever
meaning to this we desire, from "meet me at a secret location" to "hey,
I didn't mean that too seriously" to whatever.

However, if, in the second case, my software inserted ":)" instead of
U+263A SMILING FACE and your software displayed that ":)" as if it was
U+263A, then, on the character level, the interpretation of ":" and ")"
is no longer unambiguous. Functionally, the string ":)" has turned into markup.

If our message leaks into the general internet, where one cannot assume
the conventions of chat programs, then we now have the situation that
this ambiguity is exposed everywhere.

You may say that :) isn't a problem, because it rarely exists in other,
legitimate uses in text, but the example that started this discussion
was "8)". And here, there is a real problem.
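
To make the "8)" problem concrete, here is a minimal sketch in Python of
what a chat program's display step might do. The table, the "[COOL]"
placeholder and the function name are all hypothetical, made up for
illustration; no particular program is claimed to work this way.

```python
# Hypothetical emoticon table; every chat client has its own conventions.
EMOTICONS = {
    ":)": "\u263A",   # U+263A, the already-encoded smiling face
    "8)": "[COOL]",   # placeholder standing in for some "cool" face image
}

def render_emoticons(text: str) -> str:
    """Naively treat every listed ASCII sequence as markup for a symbol."""
    for sequence, symbol in EMOTICONS.items():
        text = text.replace(sequence, symbol)
    return text

# The intended use works as expected:
print(render_emoticons("Nice work :)"))

# But ordinary text that happens to contain "8)" is rewritten too,
# because nothing announces which occurrences are meant as markup:
print(render_emoticons("(see items 7 and 8)"))
```

The second call turns the closing "8)" of a parenthetical into the
placeholder, which is exactly the ambiguity at the character level.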

The case of ":)" is curious. On the one hand, it's probably the best
known punctuation sequence that everyone knows to be an emoticon
(followed by ":("). And at the same time, it's already encoded
(as U+263A).

For some other emoticons ("angelface", or "innocent") the proportion of
users who could recite the correct sequence is down there with the
number of people who know that U+263A is the Unicode for SMILING FACE or
that &nbsp; stands for no-break space. But everyone knows what the symbol
means when they see it, and how to select it from a list. However, if my
software sends the ASCII string for it, and your software *doesn't*
display the angel face, and if you are a typical user, you are stuck.
And there's a real problem.

In other words, using ASCII markup in plain text is a problem, whatever
your rationale for using it in the first place.

