Re: Emoji: emoticons vs. literacy

From: vunzndi@vfemail.net
Date: Fri Jan 09 2009 - 04:18:17 CST


    Quoting "Michael D'Errico" <mike-list@pobox.com>:

    >> Your suggestion, Michael, is to modify how the Unicode standard
    >> works in order to encode emoji and similar non-text content in a
    >> flexible and extensible way. My suggestion is that this content
    >> belongs in a different standard altogether, one that is focused on
    >> non-text content.
    >
    > I've thought about this. But since you would want to intermix text
    > and non-text, it makes sense to retain Unicode as a subset and use
    > the same UTF encoding schemes. The problem, though, is that Unicode
    > claims all the code points, so a new standard would have to violate
    > the rules, either by using planes that Unicode will probably never
    > use(*), or by going beyond plane 16 (which is impossible with UTF-16
    > and specifically disallowed for UTF-8 and UTF-32 conformance).
    >
    > Personally, I would choose the latter approach and just say that you
    > can't use UTF-16. UTF-8, even limited to 4 bytes, can encode a total
    > of 32 planes, so there would be lots of initial room. Expanding it
    > to 6 bytes as it was originally specified handles 32k planes.
    >
    > The problem with moving beyond the reach of UTF-16 is that some
    > programming languages designed their String classes to hold UTF-16
    > code units, and would therefore not be able to access the non-text
    > content. This is probably the biggest roadblock to a solution
    > outside of Unicode, and means that Unicode would either have to give
    > up some of its code space to a new standard, or embrace the ideas
    > and make them a part of Unicode.
    >
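
    To illustrate the UTF-16 limitation described in the quoted text, here
    is a minimal Python sketch. The helper names utf16_code_units and
    to_surrogates are hypothetical, chosen only for this example. It shows
    that a code point above U+FFFF occupies two 16-bit code units, and that
    the surrogate mechanism cannot address anything past U+10FFFF.

```python
# Illustrative sketch only; the helper names below are hypothetical.

def utf16_code_units(text):
    """Return the 16-bit code units of `text` when encoded as UTF-16."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def to_surrogates(code_point):
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        # A pair carries 10 + 10 bits, so nothing past plane 16 is reachable.
        raise ValueError("not representable as a UTF-16 surrogate pair")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    # One code point in plane 1 becomes two code units.
    print([hex(u) for u in utf16_code_units("\U0001F600")])
    # The last code point of plane 16 uses the last possible pair.
    print([hex(u) for u in to_surrogates(0x10FFFF)])
```

    A string class that indexes by 16-bit unit sees two elements for such a
    character, and has no pair of units at all for a value in a
    hypothetical plane 17.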

    Extending beyond plane 16 would not be that difficult, but with only
    about 25% of the 17 planes (0 through 16) allocated, there is no
    danger of filling them all in the next few decades.
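
    The plane counts quoted above can be checked with a little arithmetic.
    The following Python sketch uses hypothetical names (PLANE_SIZE,
    utf8_payload_bits, planes_for_bits) and assumes the original UTF-8
    layout of up to 6 bytes, which current conformance rules no longer
    permit.

```python
# Illustrative arithmetic only; all names here are hypothetical.

PLANE_SIZE = 0x10000  # 65,536 code points per plane


def utf8_payload_bits(length):
    """Payload bits in a UTF-8-style sequence of `length` bytes (1 to 6)."""
    if not 1 <= length <= 6:
        raise ValueError("length must be between 1 and 6")
    if length == 1:
        return 7
    # The lead byte keeps (7 - length) bits; each continuation byte carries 6.
    return (7 - length) + 6 * (length - 1)


def planes_for_bits(payload_bits):
    """Number of whole planes addressable with `payload_bits` bits."""
    return (1 << payload_bits) // PLANE_SIZE


# UTF-16 reaches the BMP plus 1024 * 1024 surrogate-pair combinations.
utf16_planes = (PLANE_SIZE + 0x400 * 0x400) // PLANE_SIZE

if __name__ == "__main__":
    print("UTF-8, 4 bytes:", planes_for_bits(utf8_payload_bits(4)), "planes")
    print("UTF-8, 6 bytes:", planes_for_bits(utf8_payload_bits(6)), "planes")
    print("UTF-16:", utf16_planes, "planes")
```

    Under these assumptions a 4-byte sequence carries 21 bits (32 planes),
    a 6-byte sequence carries 31 bits (32,768 planes), and UTF-16 stops at
    17 planes, numbered 0 through 16.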

    >
    > Well I won't be holding my breath....
    >
    > Mike
    >
    > *Whistler's Conjecture states that no characters will ever be encoded
    > beyond plane 2.
    >

    Plane 3 is now roadmapped for ideographs and named the Tertiary
    Ideographic Plane (TIP), so it will certainly have characters in it.

    John Knightley

    >
    >


