From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Dec 30 2008 - 19:10:19 CST
You need to turn emoticon display off to understand this message ;-)
I've added <digit-8, right-paren> etc. to make the meaning clear.
A./
On 12/30/2008 3:39 PM, Asmus Freytag wrote:
> On 12/30/2008 1:15 PM, Jukka K. Korpela wrote:
>> Asmus Freytag wrote:
>>
>>> Originally, emoticons started as a punny way of using punctuation.
>>
>> I don’t think that’s an accurate description. Emoticons are, by their 
>> nature, special symbols rather than punctuation, even when composed 
>> of punctuation marks. 
> Emoticons (*not* emoji) started as :) <colon, right-paren> for smiley, 
> etc. These were just strings of punctuation characters selected for 
> their visual similarity to a line drawing (using 90 degree rotation as 
> well). That's what I mean by a "punny way of using punctuation". More 
> visual, but in nature not so far removed from "lol" and "AFAIK" and 
> similar devices. At that level, they are clearly plain text (they were 
> used in eminently plain text interchange) and correctly encoded as 
> ASCII -- they were simply a form of ASCII usage.
>
> Only later were actual 2-D images, not requiring rotations, associated 
> with these strings, and supported by application software.
>> You might compare a smiley to a question mark especially in languages 
>> where question marks are the _only_ way of distinguishing a question 
>> from a statement. Yet, far more often, emoticons are just something 
>> supposedly funny, comparable to drawings.
> I think viewing emoticons globally as "just drawings" is not very 
> helpful. Certainly the most common of them are used more like an 
> extended set of punctuation symbols. (This discussion has compared 
> them to cantillation marks and similar devices - all much better 
> classifications than mere "drawings", but no need to repeat the 
> earlier discussion)
>>
>>> However, nowadays most users of these things pick them
>>> from a list of symbols
>>
>> Is that the real reason for the discussion, or is the real reason 
>> what John Hudson wrote: that some companies transmit emoticons as 
>> characters in a nonstandard encoding?
> What John refers to are the "emoji". While the emoji contain some of 
> the same symbols that are used for "emoticons", these two sets are not 
> the same thing. Emoji (the Japanese set) are encoded using single 
> character codes (SJIS extension). Emoticons, currently, are encoded 
> using strings of mostly punctuation marks (ASCII). I'm discussing 
> emoticons here.
>>
>>> What used to be a punny way of
>>> using punctuation has become de-facto markup for text elements. Just
>>> like the TeX markup for mathematical symbols, or &gt; in HTML.
>>
>> No, it’s not markup. Whatever it is, it is not special notations that 
>> enclose text characters. Entity references like &gt; might be called 
>> markup, but they are really auxiliary notations, and ”&gt;” is 
>> actually never needed, it’s used just for symmetry with ”&lt;”, which 
>> is needed because ”<” as such is really markup-significant, tag start 
>> character.
> That's digressing and therefore irrelevant. The current practice is 
> using ":)" <colon, right-paren> or "8)" <digit-8, right-paren> or 
> ":evil:" in contexts where the sender inserts them by selecting a 
> picture from a list and the receiver sees that picture inserted into 
> the text stream. That makes ":)" <colon, right-paren> etc. function 
> like *markup*.
>>
>>> However, the use case here is that the
>>> display is fixed, and it's up to the user to make the distinction.
>>
>> Once again, I don’t follow. What’s ”fixed”? There’s nothing fixed in 
>> ”8)” as far as I can tell. It’s a two-character string, which has 
>> many interpretations and many renderings.
> The situation I described is where the application supports the 
> display of the emoticon symbol, not the ASCII string. Few users, 
> seeing the symbol displayed in an inappropriate context, will be able 
> to "guess" what ASCII characters were really meant - to them, the 
> message will be compromised. (Even if you have a handy switch to turn 
> off emoticon display, I bet few users will know what to do with it - I 
> also bet many of them will not know why there's suddenly an 8) in their 
> text).
>
> That's because the current practice is no longer predominantly that of 
> users typing punny punctuation strings, but that of selecting symbols 
> from lists, and seeing these symbols displayed as if they were 
> entities (except that they are more colorful in their rendering). 
> These users no longer intend to write ASCII "8)" <digit-8, 
> right-paren>; they intend to write a symbol. To them "8)" <digit-8, 
> right-paren> is just as much markup gobbledygook as &gt; or &nbsp;. 
> That's the use case.
>>
>>> No, I'm not arguing for unlimited semantic encoding. Unicode's design
>>> point is that the display on the receiving end can unambiguously
>>> convey the intent of the author in terms of the identity and ordering
>>> of the written symbols.
>>
>> Where does the Unicode Standard state this?
>>
>> According to Wiio’s law, all communication fails, except by accident. 
>> There is absolutely no way to guarantee that a string of characters 
>> gets interpreted ”as intended”, or really any way to absolutely know 
>> what was intended. And even more certainly, it cannot be guaranteed 
>> at the level of coding characters.
> You are stumbling over "interpretation" here. That word is used in a 
> funny way in the Unicode standard. It does not refer to interpreting 
> the whole of the text, but interpreting something as a character. And 
> Unicode is indeed about making sure that sender and receiver interpret 
> codes as characters in the same way. Wiio's law is beside the point here.
>>
>> If you mean that it must be possible to indicate the meaning of 
>> something as an emoticon symbol, then I think we are back to the 
>> question whether such symbols, as independent characters and not a 
>> play on characters, are used.
>>
>> Shouldn’t this be quite independent of the question whether they have 
>> ASCII ”fallbacks” or imitations or origins? Yet we are stuck with the 
>> confused issue of ”emoticons” that are ASCII strings on one side and 
>> ”independent” characters on the other.
>>
> No, in the Unicode context, if you interpret ":)" to mean the 
> character "smiley", then you are no longer interpreting it as two 
> ASCII characters. For HTML to interpret &gt; as ">" is fine, because 
> it's a clearly defined protocol, with announcement mechanisms. For 
> general text, the use of ":)", or worse "8)", presents an ambiguity, 
> precisely because of the absence of clear protocol definition or 
> announcement mechanisms.
>
> If you used U+263A instead of ":)" then your text would no longer be 
> ambiguous on the character level.
>
> Encoding additional emoticons would have the potential benefit of 
> allowing users (and applications) to sidestep the ambiguous ASCII 
> markup. The potential benefit would be most felt where emoticons are 
> part of the most commonly used subset, and where their ASCII markup is 
> the most prone to misinterpretation or confusion with regular text 
> (e.g. "8)" <digit-8, right-paren> or "B)" <upper-B, right-paren> or 
> similar).
>
> A./
>
>
>
>
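To make the ambiguity argued above concrete, here is a minimal Python sketch. It assumes a hypothetical substitution table of the kind an application might apply when emoticon display is switched on; the table, the placeholder picture names, and the function name `render_with_pictures` are illustrative only and are not taken from any real software. It shows ordinary text that happens to contain "8)" <digit-8, right-paren> being turned into a picture, whereas the single character U+263A needs no substitution step at all.

```python
import unicodedata

# Hypothetical substitution table; the picture names are placeholders,
# not taken from any real application.
EMOTICON_PICTURES = {
    ":)": "[picture: smiley]",
    "8)": "[picture: cool]",
}


def render_with_pictures(text: str) -> str:
    """Naively replace every emoticon string with its picture placeholder."""
    for ascii_form, picture in EMOTICON_PICTURES.items():
        text = text.replace(ascii_form, picture)
    return text


# The sender meant a smiley here, so the substitution matches the intent.
print(render_with_pictures("Nice work :)"))

# Here "8)" is just a digit followed by a closing parenthesis, but the
# naive substitution cannot tell and replaces it anyway.
print(render_with_pictures("Check the list (items 7 and 8) before Monday."))

# U+263A is a single encoded character with its own identity, so text
# containing it passes through the substitution unchanged.
print(unicodedata.name("\u263A"))
print(render_with_pictures("Nice work \u263A"))
```

The second call yields "Check the list (items 7 and [picture: cool] before Monday.", which is the kind of compromised message described above: the reader has to guess which ASCII characters were really meant.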