From: Jukka K. Korpela (email@example.com)
Date: Sat Dec 27 2008 - 04:59:33 CST
Asmus Freytag wrote:
> One of the most important principles on which the Unicode effort was
> founded was to provide a unified encoding, to finally have one single
> representation for a character or symbol, [...]
> By not arbitrarily denying the needs of its users to
> have a unified representation of this new phenomenon, emoji, Unicode
> is redeeming a key promise to its implementers.
When you put it that way, it sounds very convincing. And if emoticon
characters of all kinds, except those already encoded, are placed in a plane
of their own, as suggested in the discussion, then there are hardly any
Yet, I'm not quite convinced.
> In essence, a universal character set, in order to be
> universal, has to model the world (of character code usage).
Yes, but we still have the issue of modeling the real world as opposed to
coping with imaginary or all possible worlds.
Please correct me if I'm wrong, preferably _proving_ I'm wrong, but the
so-called emoticons originated as strings of Ascii characters. The first
emoticon was ":-)", and you were supposed to turn your head 45 degrees to
the left to see it as image-like, and the joy of understanding this was part
of the idea. Then the idea was imitated in different ways, in a potentially
infinite number of ways. There was nothing in this process that was of
concern to character encoding. Of course there were more opportunities for
fun when not limited to Ascii, but encoded characters were still just
building blocks in this game.
Later, some program developers started to implement automatic conversion of
recognized emoticons to small images. So when the user of a chat system, for
example, typed ":-)", it was displayed as a small, often colored image
(icon) of a smiling face. There's nothing here that requires any action in
Now, such images, or the strings used to "trigger" them, _might_ be treated
as objects with some identity of their own, behind possible variation in
shape, colors, and all that. Moreover, these objects _might_ be recognized
as characters - to be encoded. I think the issue is whether this is really
taking place in the real world, as opposed to people's thinking of what
might happen, and whether such objects will really be used as characters.
Does anyone even _want_ that?
What tangible benefits would there be to anyone? I suppose people would
still input "emoticon characters" as strings of characters that can be
conveniently typed on common keyboards, mostly Ascii characters. They would
be turned to images by some software and left intact by other software.
Encoding as characters would add the extra option that emoticon-aware
software could convert them to characters. Why? Just because it _can_ be
done? I think this would be the only "benefit". Instead of passing data in a
format that can be treated as strings of Ascii characters, which may or may
not be mapped to images by some rules, you would readily map them to Unicode
characters that are new to the world, with no support in any currently
existing font, and with maybe some support in some fonts within a decade or
so - and unknown to all existing software, etc.
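To make the three behaviors concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the table EMOTICONS, the function render, the image file names, and the "[img:...]" placeholder are hypothetical and not taken from any real chat program. For the character case it uses two characters that are already encoded (U+263A and U+2639), not any newly proposed ones.

```python
import re

# Hypothetical table: emoticon string -> (image file name, already-encoded character).
EMOTICONS = {
    ":-)": ("smile.png", "\u263A"),  # U+263A WHITE SMILING FACE
    ":-(": ("frown.png", "\u2639"),  # U+2639 WHITE FROWNING FACE
}

# One pattern that matches any emoticon string listed in the table.
_PATTERN = re.compile("|".join(re.escape(s) for s in EMOTICONS))


def render(text, mode="plain"):
    """Return text with recognized emoticons handled according to mode.

    "plain": leave the Ascii strings intact (emoticon-unaware software).
    "image": replace each emoticon with a placeholder for a small image.
    "char":  replace each emoticon with a single Unicode character.
    """
    if mode == "plain":
        return text
    if mode not in ("image", "char"):
        raise ValueError("unknown mode: " + mode)

    def substitute(match):
        image, char = EMOTICONS[match.group(0)]
        return "[img:" + image + "]" if mode == "image" else char

    return _PATTERN.sub(substitute, text)
```

In all three modes the user still types the same Ascii string; only the receiving software's treatment of it differs, and the "char" mode depends on a font that actually has the glyph.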
-- Yucca, http://www.cs.tut.fi/~jkorpela/