RE: Emoji: emoticons vs. literacy

From: James Kass (
Date: Sat Jan 03 2009 - 00:00:39 CST

    Peter Constable replied,

    >>> These are getting interchanged publicly between
    >>> different vendors' products. That's not private use.
    >> Semantics. There is no point to user defined characters if
    >> they can't be exchanged. There is even at least one well-known
    >> PUA registry.
    > I don't mean just communicated between different vendors'
    > processes, but also interpreted and processed by different
    > vendors' processes, in contexts where no private agreement
    > can be assumed.

    The existence of a private agreement is a given, otherwise
    neither interpretation nor processing would be desired. In
    contexts where the nature of the private agreement cannot
    be determined, no interpretation is possible. Processing can
    be done on uninterpreted strings. I don't need to be able to
    speak Hindi in order to enter, store, search, and collate text
    written in Devanagari, and neither does my plain-text editor.
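    A minimal sketch of what processing uninterpreted strings can look
    like, assuming Python and its standard unicodedata module. The
    sample messages and the code points U+E001 and U+E002 are arbitrary
    illustrations, not taken from any vendor's actual set.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if ch has general category Co (Private Use)."""
    return unicodedata.category(ch) == "Co"


def contains_private_use(text: str) -> bool:
    """True if any character in text is a private-use character."""
    return any(is_private_use(ch) for ch in text)


# Hypothetical stored messages; the PUA code points are arbitrary examples.
messages = ["see you \ue002", "lunch? \ue001", "plain text only"]

# Search: an exact code point match needs no knowledge of what U+E001 means.
hits = [m for m in messages if "\ue001" in m]

# Collate: a plain code point ordering also works without interpretation.
ordered = sorted(messages)

# Flag the strings that would need a private agreement to be interpreted.
needs_agreement = [m for m in messages if contains_private_use(m)]
```

    Nothing here knows what the characters denote; storing, searching,
    and sorting operate on code points alone.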

    Success in interpreting the text, then, lies in determining the
    nature of the private agreement. This is not a new concept;
    it has been discussed here previously, unless I'm mistaken.
    Mark-up was one method mentioned, if I recall correctly.
    Search engines can interpret mark-up.

    > If text content is getting generated in (say)
    > DoCoMo text protocols, spreading into other content via
    > other protocols and then that content is getting interpreted
    > by processes produced by Google or Apple or whomever,
    > then the sense in UTC (I think I can say) is going to be that
    > that is *public* interchange, hence presenting a case for
    > being representable in the UCS.

    Public interchange of private characters, which happens all
    the time, is a good indicator that a case might be made for
    plain-text encoding. As for suitability, again, opinions may
    vary and members vote. (I'm trying to rephrase and expand on what
    you said to see if we're basically agreeing here.)

    >> Quite so. Refusing to encode these would be the best
    >> tactic to keep others from using the PUA to "promote"
    >> their thingies into regular Unicode.
    > By that line of argumentation, we could completely
    > freeze encoding of any new characters as a tactic to
    > keep others from inventing new characters that might
    > need to be encoded -- sure, we could do that; but that
    > doesn't mean we *should* on that basis.

    We shouldn't exclude text-like characters from being included
    in a plain-text encoding standard as long as all the criteria are
    met. "Thingies" might have been a poor choice of words on
    my part. To rephrase, refusing to encode this set of proprietary
    random icons en masse would prevent others from trying to
    get their icon sets (or whatever) promoted.

    This is not to say that some of the underlying symbols which some
    of the icons represent shouldn't be encoded; many of them
    already are. Michael Everson pointed out some which weren't
    and probably should be the last time we went around on this.
    The remainder may be rejected for unsuitability after careful
    study. The ones which might get newly encoded as plain-text
    characters should *be* plain-text characters.

    The vendors who invented this icon set should continue to use
    the PUA to exchange them. They are icons/signage and are
    being exchanged and interpreted by humans as icons/signage.
    Any machine interpretation of them should emulate what
    people are doing. It's OK for there to be some overlap between
    icons/signage and plain-text characters; after all, many of
    those icons are pictures of those characters.

    Standardizing an icon set in plain text opens a door best
    left closed.

    Establishing a method to identify PUA schemes would enable
    interpretation by any process which does that sort of thing
    for much, much more than the emoji icon set.
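    One way such identification could be sketched, assuming Python.
    The registry, the scheme names, and the mappings below are all
    invented for illustration; no real registry or vendor assignment
    is implied.

```python
from typing import Optional

# Hypothetical registry: scheme identifier -> {code point: agreed meaning}.
# Every name and mapping here is made up for the example.
PUA_SCHEMES = {
    "example-vendor-a": {0xE001: "sun icon", 0xE002: "umbrella icon"},
    "example-vendor-b": {0xE001: "telephone icon"},
}


def interpret(ch: str, scheme: Optional[str]) -> Optional[str]:
    """Return the agreed meaning of ch under scheme, or None if unknown.

    With no scheme identified, no interpretation is attempted.
    """
    if scheme is None:
        return None
    return PUA_SCHEMES.get(scheme, {}).get(ord(ch))


# The same code point means different things under different agreements.
meaning_a = interpret("\ue001", "example-vendor-a")
meaning_b = interpret("\ue001", "example-vendor-b")
meaning_unknown = interpret("\ue001", None)
```

    The lookup is generic: any private agreement that can be named
    can be plugged in, not only an emoji set.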

    (Of course, there is already a mark-up solution in place. Hint
    to search engines everywhere desiring interpretation of PUA
    code points: check the font(s) specified in the mark-up.)
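    A rough sketch of that hint, assuming Python's standard
    html.parser module and HTML mark-up that names fonts either with
    a face attribute or an inline font-family style. The class name
    PuaFontFinder and the font name ExampleIcons are hypothetical.
    Stylesheets and inherited CSS are ignored, and a void element
    that names a font stays in effect until its parent closes.

```python
import unicodedata
from html.parser import HTMLParser


class PuaFontFinder(HTMLParser):
    """Collect (code point, font) pairs for PUA characters in mark-up."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # (tag, font name or None) for each open element
        self.found = []      # (code point label, font name or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        font = attrs.get("face")  # e.g. <font face="...">
        style = attrs.get("style") or ""
        for decl in style.split(";"):
            name, _, value = decl.partition(":")
            if name.strip().lower() == "font-family":
                # Keep only the first family listed.
                font = value.split(",")[0].strip().strip("'\"")
        self.open_tags.append((tag, font))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break

    def current_font(self):
        # The innermost enclosing element that names a font wins.
        for _, font in reversed(self.open_tags):
            if font:
                return font
        return None

    def handle_data(self, data):
        for ch in data:
            if unicodedata.category(ch) == "Co":
                self.found.append((f"U+{ord(ch):04X}", self.current_font()))


finder = PuaFontFinder()
finder.feed(
    '<p>Rain <span style="font-family: ExampleIcons, sans-serif">'
    "\ue002</span> later \ue001</p>"
)
finder.close()
```

    A PUA code point paired with a font name gives a process something
    to look up; one paired with no font remains uninterpreted.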

    Best regards,

    James Kass
