From: Doug Ewell (doug@ewellic.org)
Date: Sat Jan 03 2009 - 16:52:12 CST
John Hudson <john at tiro dot ca> wrote:
> I don't think PUA characters should be used to encode emoji any more 
> than I think standardised Unicode characters should be used to encode 
> emoji. It seems to me that we're looking at encoding as textual 
> characters things that in important respects do not behave like 
> textual characters only because someone else has treated them as 
> textual characters (for the purpose of telecom transmission).
I agree 100% with what John says, notwithstanding my earlier post that 
the use of PUA characters is not evil in the abstract.  In fact, 
supporters of emoji have even stated that the reason UTC is obliged to 
encode them is that their hands are tied by the Japanese cell phone 
vendors already having done so.
> Since, other than transmission, emoji do not behave like other text --  
> they are not supported by normal text layout and font interaction, but 
> as inline graphics --, it seems to me that what we're looking at is 
> not character encoding as we typically understand it but transmission 
> code standardisation. What the telecom companies need is a reliable 
> way for one device to tell another device that emoji graphic X should 
> be displayed; i.e. they need to send some kind of identifier from one 
> device to another.
In fact, there are already technologies for representing non-text 
objects in a plain text stream.  One of them is called SGML, and it 
served as the foundation for others called HTML and XML.  There are 
rumors that some people have learned to use these formats in certain 
applications.
In all seriousness, John's suggestion of "some kind of identifier" is a 
better solution not only for Unicode but for the cell phone vendors 
as well.  They could define a new 0x1B escape code in the ETSI character 
set (see http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT for 
details) to signify that an emoji index (possibly numeric, possibly 
symbolic) follows.
With an open-ended mechanism like this, they could expand their emoji 
repertoires easily and almost limitlessly.  By registering new emoji 
virtually on demand, perhaps even with direct customer involvement, they 
would be in a perfect position to satisfy the major stated needs of 
their customer base.
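
To make the escape-code idea concrete, here is a rough Python sketch of how such an emoji reference might be framed and parsed. Only the 0x1B escape itself comes from the ETSI table; the introducer byte 0x45, the ';' terminator, the decimal index format, and the function names are all assumptions made up for illustration, not anything the vendors or ETSI have defined.

```python
ESC = 0x1B               # escape code in the ETSI GSM 03.38 character set
EMOJI_INTRODUCER = 0x45  # hypothetical: byte after ESC meaning "emoji index follows"
TERMINATOR = 0x3B        # hypothetical: ';' (0x3B in GSM 03.38) closes the index


def encode_emoji(index: int) -> bytes:
    """Frame a numeric emoji index as ESC, introducer, decimal digits, ';'."""
    if index < 0:
        raise ValueError("emoji index must be non-negative")
    # Digits 0-9 occupy 0x30-0x39 in both GSM 03.38 and ASCII.
    digits = str(index).encode("ascii")
    return bytes([ESC, EMOJI_INTRODUCER]) + digits + bytes([TERMINATOR])


def split_message(data: bytes) -> list:
    """Split a byte stream into ("text", bytes) and ("emoji", int) items."""
    items = []
    text = bytearray()
    i = 0
    while i < len(data):
        is_emoji = (
            data[i] == ESC
            and i + 1 < len(data)
            and data[i + 1] == EMOJI_INTRODUCER
        )
        if is_emoji:
            end = data.find(bytes([TERMINATOR]), i + 2)
            if end == -1:
                raise ValueError("unterminated emoji reference")
            digits = data[i + 2:end]
            if not digits.isdigit():
                raise ValueError("emoji index must be decimal digits")
            if text:
                # Flush the ordinary text collected before this reference.
                items.append(("text", bytes(text)))
                text.clear()
            items.append(("emoji", int(digits.decode("ascii"))))
            i = end + 1
        else:
            # Ordinary bytes, including other ESC sequences, pass through.
            text.append(data[i])
            i += 1
    if text:
        items.append(("text", bytes(text)))
    return items
```

Under these assumptions, split_message(b"Hi " + encode_emoji(42) + b"!") yields a text run, the emoji index 42, and a second text run. Because the index is an open-ended decimal number rather than a fixed code point, new emoji could be registered without touching the character set at all. A symbolic index would need only a different rule for what may appear between the introducer and the terminator.
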
> They have been using character codes because it seems convenient, but 
> that doesn't imply that this is the only or best method, and it 
> certainly doesn't imply that everything that gets transmitted as text 
> is text or is suitable content for a text encoding standard. I might 
> as easily use a character code as a trigger to play a sound file as to 
> display an inline graphic; that doesn't make the sound file a 
> character.
Perfectly correct, and perfectly stated.
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages