Re: Emoji: emoticons vs. literacy

From: Asmus Freytag
Date: Tue Dec 30 2008 - 19:10:19 CST

You need to turn emoticon display off to understand this message ;-)

I've added <digit-8, right-paren> etc. to make the meaning clear.


On 12/30/2008 3:39 PM, Asmus Freytag wrote:
> On 12/30/2008 1:15 PM, Jukka K. Korpela wrote:
>> Asmus Freytag wrote:
>>> Originally, emoticons started as a punny way of using punctuation.
>> I don’t think that’s an accurate description. Emoticons are, by their
>> nature, special symbols rather than punctuation, even when composed
>> of punctuation marks.
> Emoticons (*not* emoji) started as :) <colon, right-paren> for smiley,
> etc. These were just strings of punctuation characters selected for
> their visual similarity to a line drawing (using 90 degree rotation as
> well). That's what I mean by "punny way of using punctuation". More
> visual, but in nature not so far removed from "lol" and "AFAIK" and
> similar devices. At that level, they are clearly plain text (they were
> used in eminently plain text interchange) and correctly encoded as
> ASCII -- they were simply a form of ASCII usage.
> Only later were actual 2-D images, not requiring rotations, associated
> with these strings, and supported by application software.
>> You might compare a smiley to a question mark especially in languages
>> where question marks are the _only_ way of distinguishing a question
>> from a statement. Yet, far more often, emoticons are just something
>> supposedly funny, comparable to drawings.
> I think viewing emoticons globally as "just drawings" is not very
> helpful. Certainly the most common of them are used more like an
> extended set of punctuation symbols. (This discussion has compared
> them to cantillation marks and similar devices - all much better
> classifications than mere "drawings", but no need to repeat the
> earlier discussion.)
>>> However, nowadays most users of these things pick them
>>> from a list of symbols
>> Is that the real reason for the discussion, or is the real reason
>> what John Hudson wrote: that some companies transmit emoticons as
>> characters in a nonstandard encoding?
> What John refers to are the "emoji". While the emoji contain some of
> the same symbols that are used for "emoticons", these two sets are not
> the same thing. Emoji (the Japanese set) are encoded using single
> character codes (SJIS extension). Emoticons, currently, are encoded
> using strings of mostly punctuation marks (ASCII). I'm discussing
> emoticons here.
>>> What used to be a punny way of
>>> using punctuation has become de-facto markup for text elements. Just
>>> like the TeX markup for mathematical symbols, or &gt; in HTML.
>> No, it’s not markup. Whatever it is, it is not special notations that
>> enclose text characters. Entity references like &gt; might be called
>> markup, but they are really auxiliary notations—and ”&gt;” is
>> actually never needed, it’s used just for symmetry with ”&lt;”, which
>> is needed because ”<” as such is really markup-significant, tag start
>> character.
> That's digressing and therefore irrelevant. The current practice is
> using ":)" <colon, right-paren> or "8)" <digit-8, right-paren> or
> ":evil:" in contexts where the sender inserts them by selecting a
> picture from a list and the receiver sees that picture inserted into
> the text stream. That makes ":)" <colon, right-paren> etc. function
> like *markup*.
>>> However, the use case here is that the
>>> display is fixed, and it's up to the user to make the distinction.
>> Once again, I don’t follow. What’s ”fixed”? There’s nothing fixed in
>> ”8)” as far as I can tell. It’s a two-character string, which has
>> many interpretations and many renderings.
> The situation I described is where the application supports the
> display of the emoticon symbol, not the ASCII string. Few users,
> seeing the symbol displayed in an inappropriate context, will be able
> to "guess" what ASCII characters were really meant - to them, the
> message will be compromised. (Even if you have a handy switch to turn
> off emoticon display, I bet few users will know what to do with it - I
> also bet many of them will not know why there's suddenly an 8) in their
> text.)
> That's because the current practice is no longer predominantly that of
> users typing punny punctuation strings, but that of selecting symbols
> from lists, and seeing these symbols displayed as if they were
> entities (except that they are more colorful in their rendering).
> These users no longer intend to write ASCII "8)" <digit-8,
> right-paren>; they intend to write a symbol. To them "8)" <digit-8,
> right-paren> is just as much markup gobbledygook as &gt; or &nbsp;.
> That's the use case.
>>> No, I'm not arguing for unlimited semantic encoding. Unicode's design
>>> point is that the display on the receiving end can unambiguously
>>> convey the intent of the author in terms of the identity and ordering
>>> of the written symbols.
>> Where does the Unicode Standard state this?
>> According to Wiio’s law, all communication fails, except by accident.
>> There is absolutely no way to guarantee that a string of characters
>> gets interpreted ”as intended”, or really any way to absolutely know
>> what was intended. And even more certainly, it cannot be guaranteed
>> at the level of coding characters.
> You are stumbling over "interpretation" here. That word is used in a
> funny way in the Unicode standard. It does not refer to interpreting
> the whole of the text, but interpreting something as a character. And
> Unicode is indeed about making sure that sender and receiver interpret
> codes as characters in the same way. Wiio's law is beside the point here.
>> If you mean that it must be possible to indicate the meaning of
>> something as an emoticon symbol, then I think we are back to the
>> question whether such symbols, as independent characters and not a
>> play on characters, are used.
>> Shouldn’t this be quite independent of the question whether they have
>> ASCII ”fallbacks” or imitations or origins? Yet we are stuck with the
>> confused issue of ”emoticons” that are ASCII strings on one side and
>> ”independent” characters on the other.
> No, in the Unicode context, if you interpret ":)" to mean the
> character "smiley", then you are no longer interpreting it as two
> ASCII characters. For HTML to interpret &gt; as ">" is fine, because
> it's a clearly defined protocol, with announcement mechanisms. For
> general text, the use of ":)", or worse "8)" presents an ambiguity,
> precisely because of the absence of clear protocol definition or
> announcement mechanisms.
> If you used U+263A instead of ":)" then your text would no longer be
> ambiguous on the character level.
> Encoding additional emoticons would have the potential benefit of
> allowing users (and applications) to sidestep the ambiguous ASCII
> markup. The potential benefit would be most felt where emoticons are
> part of the most commonly used subset, and where their ASCII markup is
> the most prone to misinterpretation or confusion with regular text
> (e.g. "8)" <digit-8, right-paren> or "B)" <upper-B, right-paren> or
> similar).
> A./
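
To make the ambiguity concrete, here is a minimal sketch (not any real application's logic) of what a naive emoticon display layer does to plain text. The shortcode table, the "[cool face]" placeholder and the function name render_emoticons are assumptions made up for illustration; the only fixed fact used is that U+263A is a single encoded character.

```python
import unicodedata

# Hypothetical shortcode table; every application defines its own,
# which is exactly the "no announced protocol" problem.
EMOTICON_SHORTCODES = {
    ":)": "\u263A",        # an already encoded character
    "8)": "[cool face]",   # placeholder for an application-specific picture
}

def render_emoticons(text: str) -> str:
    """Replace every shortcode occurrence, with no way to tell intent."""
    for code, symbol in EMOTICON_SHORTCODES.items():
        text = text.replace(code, symbol)
    return text

# The author meant list item 8, but the receiver sees a picture.
plain = "See items 7) and 8) in the list :)"
rendered = render_emoticons(plain)
print(rendered)

# By contrast, U+263A is one code point with a defined identity,
# so no string matching (and no guessing) is involved.
smiley = "\u263A"
print(len(smiley), unicodedata.name(smiley))
```

The "8)" in the sample sentence is ordinary text, yet it is replaced along with the intended ":)"; a sender who used U+263A directly would not depend on the receiver's matching rules at all.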
