Re: Emoji: emoticons vs. literacy

From: Asmus Freytag
Date: Sat Jan 03 2009 - 17:37:01 CST

    On 1/3/2009 2:23 PM, Doug Ewell wrote:
    > Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
    >> Unicode was definitely designed in firm opposition to ISO 2022 as
    >> well as ISO 10646 DIS-1 which all use(d) stateful controls to achieve
    >> code-set switching. Trying to reintroduce this, for example for
    >> private use set switching, is to take aim at one of the core design
    >> goals for the standard.
    > I contend that encoding fad symbols with only a very short history of
    > use, in a niche environment, some of which depend on color and
    > animation for their identity, takes aim at several different core
    > design goals.
    >>> (I almost wrote "chirping." Are audio-enabled characters on the
    >>> horizon?)
    >> Been there, done that: U+0007, the control code to ring the BELL on
    >> your terminal predates Unicode by decades!
    > Nice. But TUS 5.0 doesn't list U+0007 in Table 16-1 (p. 533) as one
    > of the "Control Codes Specified in the Unicode Standard." Neither is
    > it treated in any special control-code way in SCSU (UTS #6), unlike
    > NUL and CR and LF. The implication is that UTC doesn't consider BEL
    > as a control code whose use is "widespread and important to
    > interoperability," whereas the whole point of encoding emoji seems to
    > be interoperability.
    You are ignoring history, again. U+0007 is indeed coded as BELL in
    Unicode 1.0. In the subsequent merger with 10646 a number of compromises
    were made by both committees to be able to move forward. WG2 gave up on
    the idea of reserving control *bytes* in the middle of multi-byte
    characters (e.g. 0x0750 would not have been a legal character code, nor
    0x5007), and, as part of a package of similar compromises, Unicode was
    willing to tolerate weakening the definition of the control characters
    to be generic in principle, but considered defined by ISO 6429 by default.
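
To make the byte-level restriction concrete, here is a minimal Python sketch of the kind of check that reserving control bytes inside multi-byte codes would imply. The function name `has_control_byte` is hypothetical, and the choice of reserved ranges (C0 = 0x00-0x1F, C1 = 0x80-0x9F) is an assumption; the message itself only offers 0x0750 and 0x5007 as examples.

```python
# Assumption: "control bytes" means the C0 (0x00-0x1F) and C1 (0x80-0x9F)
# ranges. The message only gives 0x0750 and 0x5007 as examples.
C0 = range(0x00, 0x20)
C1 = range(0x80, 0xA0)


def has_control_byte(code: int) -> bool:
    """Return True if either byte of a 16-bit code falls in C0 or C1."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("expected a 16-bit code")
    high, low = code >> 8, code & 0xFF
    return any(b in C0 or b in C1 for b in (high, low))


for code in (0x0750, 0x5007, 0x4E2D):
    # 0x0750 and 0x5007 each contain the byte 0x07; 0x4E2D contains no
    # byte in either assumed range.
    print(f"0x{code:04X}: {has_control_byte(code)}")
```

Under these assumptions the two codes named in the message are rejected, while 0x4E2D passes.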

    Control codes were definitely considered "widespread and important to
    interoperability", and having dedicated code points allowed translating
    terminal data streams to Unicode, or translating such data streams
    between character sets, using Unicode as the pivot, and so on and on.
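
The pivot idea can be sketched in a few lines of Python. This is only an illustration: the helper name `transcode`, the sample byte stream, and the choice of code page 437 and Latin-1 as source and target are assumptions, not anything taken from the discussion. It relies on Python's codecs mapping the byte 0x07 to U+0007 in both character sets, so the control code passes through the Unicode pivot unchanged.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert a byte stream between character sets via Unicode text."""
    text = data.decode(source)   # source charset -> Unicode (the pivot)
    return text.encode(target)   # Unicode -> target charset


# Hypothetical terminal stream in code page 437: "caf", e-acute (0x82),
# BEL (0x07), then CR LF.
stream = b"caf\x82\x07\r\n"

pivot = stream.decode("cp437")
print(repr(pivot))                             # BEL is now U+0007
print(transcode(stream, "cp437", "latin-1"))   # e-acute becomes 0xE9
```

The accented letter changes its byte value between the two sets, while BEL, CR and LF keep theirs because each has a dedicated code point in the pivot.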

    Later, the fact that NUL, TAB, CR, LF and to a lesser extent NEL and VT
    are really not so much used as device controls, but act like *format
    characters* led to their recognition in several algorithms. Recently,
    the mood in WG2 is tending towards more explicit documentation of these
    characters, which you could interpret, in a way, as a vindication of the
    original approach by Unicode 1.0.
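
One place this distinction is visible is the Unicode Character Database as exposed by Python's standard `unicodedata` module. The sketch below is an illustration only: the dictionary `CONTROLS` and the helper `bidi_classes` are hypothetical names, and `str.splitlines()` reflects Python's own line-boundary rules rather than any single Unicode algorithm.

```python
import unicodedata

# The control codes named in this thread.
CONTROLS = {
    "NUL": "\u0000", "BEL": "\u0007", "TAB": "\u0009", "LF": "\u000A",
    "VT": "\u000B", "CR": "\u000D", "NEL": "\u0085",
}


def bidi_classes() -> dict:
    """Map each control's short name to its Bidi_Class property value."""
    return {name: unicodedata.bidirectional(ch) for name, ch in CONTROLS.items()}


# All of them share General_Category Cc, but the bidirectional property
# gives TAB, LF, VT, CR and NEL separator classes (S or B), while NUL and
# BEL are plain boundary-neutral (BN).
for name, ch in CONTROLS.items():
    print(name, unicodedata.category(ch), unicodedata.bidirectional(ch))

# Python's splitlines() breaks on LF, CR, VT and NEL, but not on BEL.
sample = "one\ntwo\rthree\x0bfour\x85five\x07six"
print(sample.splitlines())
```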
    >> PS: I'm content to allow the proposers to suggest the use of
    >> 'cartoon-style' drawings that suggest movement by graphical means. If
    >> that's what they end up proposing and if that is satisfactory for
    >> interoperability purposes, then I have no further problem. (I also
    >> would not have a problem if user agents that display yellow emoticons
    >> were to give a colorful rendition of U+263A).
    > I don't mind at all if characters *can* be colored and animated, by
    > any vendor who chooses to do so. I do mind if they *must* be colored
    > and/or animated in order to retain their identity, or be
    > distinguishable from other such characters.
    Unicode: all the characters as long as they are black and white. ?

    Actually, as long as color support is not universal (and even 'modern'
    inventions like the Kindle can't handle color), requiring a b/w fallback
    is reasonable, and then you can formally encode that, using hatching or
    whatever graphical means. But, given a well-thought-out fallback, I
    don't mind in the least if the character name is based on the color that
    would be desired for full rendering.

