Emoji and Dingbats
Q: What are emoji?
A: Emoji are “picture characters” most frequently associated with cellular telephone usage in Japan, but also used in other East Asian countries and in other contexts. Use of emoji is also growing outside East Asia; some emoji-enabling applications for smartphones are very popular.
Emoji are often pictographs—images of things such as faces, weather, vehicles and buildings, food and drink, animals and plants—or icons that represent emotions, feelings, or activities. In cellular phone usage, many emoji characters are presented in color (sometimes as a multicolor image), and some are presented in animated form, usually as a repeating sequence of two to four images—for example, a pulsing red heart. [PE]
Q: Are emoji the same thing as emoticons?
A: Not exactly. Emoticons (from “emotion” plus “icon”) are specifically intended to depict facial expression or body posture as a way of conveying emotion or attitude in e-mail and text messages. They originated as ASCII character combinations such as :-) to indicate a smile—and by extension, a joke—and :-( to indicate a frown. In East Asia, a number of more elaborate sequences have been developed, such as (")(-_-)(") showing an upset face with hands raised. Over time, many systems began replacing such sequences with images, and also began providing ways to input emoticon images directly, such as a menu or palette. The emoji sets used by Japanese cell phone carriers contain a large number of characters for emoticon images, along with many other non-emoticon emoji.[PE]
Q: What are the most popular emoji characters?
A: The site emojitracker.com tracks the real-time use of many emoji on Twitter, so you can see the most and least used emoji characters there.
Q: Can you point me to some examples of emoji characters in Unicode?
A: The emoji are spread throughout many blocks of Unicode. A good sample can be found in Miscellaneous Symbols And Pictographs. Notice that in these charts emoji are shown in black and white, whereas they typically appear in color on mobile phones and computers.
Q: Do emoji characters have to look the same wherever they are used?
A: No, they don’t have to look the same. For example, here are just some of the possible images for U+1F36D LOLLIPOP, U+1F36E CUSTARD, U+1F36F HONEY POT, and U+1F370 SHORTCAKE:
Q: What about diversity?
A: As with the examples of emoji characters representing food items above, an emoji character like U+1F474 OLDER MAN can vary in appearance depending on the font. Unicode does not require a particular racial or ethnic appearance—or for that matter, a particular hair style: bald or hirsute. However, because there are concerns regarding the emoji characters for people, proposals are being developed by Unicode Consortium members to provide more diversity.
Q: How were emoji encoded on cell phones?
A: Cell phone carriers in Japan have long encoded some emoji in Shift-JIS and ISO-2022 as extensions of the JIS X 0208 character set. A core set of 722 emoji constitutes the union of the emoji sets encoded in this way by the three most popular cell phone carriers in Japan. These core emoji characters are interchanged as plain text by millions of people daily (in SMS text messages and e-mail subject lines, for example), and need to be handled by e-mail systems, search engines, publishing systems, databases, and so on. For emoji beyond this core set (including those that are still being created), vendors have added rich text support, and use approaches such as embedded graphics. Similar techniques (embedded graphics or escape tags designating emoji) are also typically used for emoji support in China and the Republic of Korea. [PE]
Q: How were emoji originally encoded in Unicode?
A: 114 characters in the core emoji set are mapped to sequences of one or more characters available in Unicode before Version 6.0. The other 608 characters in the core emoji set are mapped to sequences of one or more characters added in Unicode 6.0, primarily in the blocks for Miscellaneous Symbols and Pictographs, Emoticons, Transport and Map Symbols, but also in blocks such as Dingbats and Technical Symbols. There is no block set aside specifically for emoji.
Characters that are separate in the extended JIS X 0208 sets used by the three major cell phone carriers in Japan are mapped to separate characters in Unicode in what is known as the Emoji Source Separation Rule. For example, the emoji core set includes a character mapped to U+1F3B5 MUSICAL NOTE; this could not be unified with U+266A EIGHTH NOTE, because both exist as separate characters in the extended JIS sets used by all three of the major cell phone carriers in Japan.
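As a small illustration of the separation described above, the following Python sketch looks up the two code points named in the text and shows that they are distinct characters with distinct names. It relies only on the standard unicodedata module; the helper name describe is made up for this example, and the result depends on the Unicode version bundled with the Python build in use.

```python
import unicodedata

# The two code points named in the text above.
MUSICAL_NOTE = "\U0001F3B5"  # added in Unicode 6.0
EIGHTH_NOTE = "\u266A"       # available before Unicode 6.0


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(MUSICAL_NOTE))  # U+1F3B5 MUSICAL NOTE
print(describe(EIGHTH_NOTE))   # U+266A EIGHTH NOTE
```

Because the two characters are encoded separately, text that uses one is not equal to text that uses the other, so the distinction made in the carrier sets is preserved when it is converted to Unicode.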
Because characters in the core emoji set are treated as pictographs, they are encoded in Unicode based primarily on their general appearance, not on an intended semantic. In fact, when used as emoji, many of these characters acquire multiple meanings based on their appearance; for example, an emoji character for “bank” which includes the letters “BK” has taken on the secondary meaning “bakkureru” (a slang term for evading one’s responsibilities). The identity of characters in the emoji core set is defined primarily by their mapping to Unicode, as specified in the file EmojiSources.txt. [PE]
Q: How many emoji characters are in Unicode now?
A: This question does not have a simple answer, because there is no clear line separating which pictographs should and should not be displayed with a typical emoji style. Roughly speaking, aside from the core set there are about 300 other characters in Unicode 6.3 that could also reasonably be displayed with typical emoji style (colored), such as U+1F46D TWO WOMEN HOLDING HANDS. There are also ways of representing emoji for national flags, adding about 240 others.
Q: Which characters were added most recently?
A: Thirteen new faces were added in Unicode 6.1, such as U+1F634 SLEEPING FACE.
Q: How about the near future?
A: Unicode 7.0 is planned for release in July of 2014. It includes about 250 pictographs that might reasonably have an emoji presentation.
Q: How should emoji be displayed?
A: While emoji symbols may be presented using color and animation, they need not be. Because many characters in the core emoji sets are unified with Unicode characters that originally came from other sources, there is no way based on character code alone to tell whether a character should be presented using an “emoji” style; that decision depends on context. [PE]
Q: Is there any way to control the “emoji” style?
A: Certain characters can be followed by a special character called a variation selector to request a particular appearance: U+FE0F for the emoji style (typically colored), and U+FE0E for the text style (black and white). Only certain characters qualify; the exact characters are listed in the file StandardizedVariants.txt.
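The mechanism can be sketched in a few lines of Python. This is a minimal illustration, not a complete implementation: the helper names with_style and code_points are invented for this example, and the code does not check whether the base character is actually listed as eligible. U+2764 HEAVY BLACK HEART is used here on the assumption that it is among the listed characters.

```python
VS_TEXT = "\uFE0E"   # VARIATION SELECTOR-15: requests the text style
VS_EMOJI = "\uFE0F"  # VARIATION SELECTOR-16: requests the emoji style


def with_style(base: str, emoji: bool) -> str:
    """Append a variation selector to one base character (illustrative helper).

    No check is made that the base character is listed as eligible.
    """
    if len(base) != 1:
        raise ValueError("expected a single base character")
    return base + (VS_EMOJI if emoji else VS_TEXT)


def code_points(text: str) -> list:
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


heart = "\u2764"  # HEAVY BLACK HEART (assumed to be an eligible base)
print(code_points(with_style(heart, emoji=True)))   # ['U+2764', 'U+FE0F']
print(code_points(with_style(heart, emoji=False)))  # ['U+2764', 'U+FE0E']
```

The selector is only a request. Whether the colored or the black-and-white glyph actually appears still depends on the fonts and rendering software on the device.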
Q: What about characters whose name specifies a color?
A: Some of the characters from the core emoji sets have names that include a color term, for example, BLUE HEART or ORANGE BOOK. These color terms in the names do not imply any requirement about how a character must be presented; they are intended only to help identify the corresponding character in the core emoji sets. Even names of symbols such as BLACK MEDIUM SQUARE or WHITE MEDIUM SQUARE are not meant to indicate that the corresponding character must be presented in black or white, respectively; rather, the use of black and white is generally just to contrast filled versus outline shapes, or a darker color fill versus a lighter color fill. [PE]
Q: What is the difference between emoji and dingbats?
A: Most of the characters in the Dingbats block are derived from a well-established set of glyphs, the ITC Zapf Dingbats series 100, which constitutes the industry standard “Zapf Dingbat” font currently available in most laser printers. Emoji and dingbats have some similarities (and a few core emoji characters are mapped to characters in the Dingbats block). However, while there is often a great deal of flexibility in the range of glyph shapes that may be used for presentation of emoji, most characters in the Dingbats block are expected to be presented with glyph shapes that closely align with those shown in the Unicode Standard. [PE]
Q: How do emoji relate to other Japanese symbol sets?
A: Other symbol sets defined in Japanese standards overlap extensively with the characters in the core emoji set. For example:
Many characters from the Japanese television standard ARIB STD-B24 2007 (from the Association of Radio Industries and Businesses) were added to Unicode in Version 5.2, and are mapped to characters in the core emoji set.
The Japanese recording industry standard RIS-506-1996 specifies an extension of Shift-JIS for use in Music CD text, and includes a number of characters similar to those in the core emoji set. [PE]
Q: What about Wingdings and Webdings? Are they encoded, and if not, why?
A: Many of the symbols in Microsoft’s Webdings and Wingdings series fonts are in Unicode 6.3 or earlier. The remainder of these are in Unicode 7.0, planned for release in July of 2014.
Q: Does the Unicode Consortium design the emoji used on my phone and elsewhere?
A: No, the Unicode Consortium does not design emoji. The emoji encoded in the Unicode Standard were added because they were already in use as smartphone characters for text messaging in a number of Japanese manufacturers’ corporate standards, and elsewhere.
Q: I’d like my favorite emoji added to my phone. Can the Unicode Consortium add it?
A: The Unicode Consortium does not make or sell fonts, images, or icons. For concerns about the emoji and flag symbols available in any particular application or mobile platform, please contact the manufacturer. Their software determines what characters are available on your device.
The Unicode Consortium encourages the use of embedded graphics where possible, since they allow much more freedom of expression. For example, see phone icons.
Q: How can I get the Unicode Consortium to add a Unicode emoji?
A: Adding characters to an encoding standard involves a long, formal process. To be considered, characters must be in widespread use as textual elements. The emoji and various symbols were added to Unicode because of their use as characters for text messaging in a number of Japanese manufacturers’ corporate standards, and elsewhere. If you wish to submit emoji or any other character for consideration for encoding, see the detailed instructions about how to submit character encoding proposals. It may also be helpful to consult the Unicode Forum or the Unicode Mail List.
Q: Why is the process so long and complicated?
A: Unicode is the foundation for all modern software: that’s how all mobile phones, desktops, and other computers represent all text of every language. You are using Unicode every time you type a key on your phone or desktop computer, and every time you look at a web page or text in an application.
It is thus very important that the standard be stable, and that every character that goes into it be scrutinized carefully.
Q: Why can’t I find my national flag in my mobile application or on my smart phone?
A: For concerns about the emoji and flag symbols available in any particular application or mobile platform, please contact the manufacturer. Their software determines what characters are available on your device.
Q: But the Unicode Standard includes other flags. Why don’t you include my flag?
A: The Unicode Standard does not encode single characters directly representing the symbols, icons, or images for any national flag. It does encode a set of regional indicator symbols. These can be used in pairs to represent any territory that has an ISO 3166-1 two-letter code, such as “DE” for Germany. The pairs are typically displayed as national flags: there are currently 249 such combinations.
However, the Unicode Standard itself does not prescribe which regional indicator pairs are represented with flags in fonts and input palettes on any given device. Please see Section 15.10, Enclosed and Square, in the Unicode Standard.
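The pairing of regional indicator symbols can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name region_to_indicators is invented for this example, and the code only checks that the input is two ASCII letters, not that it is an assigned ISO 3166-1 code.

```python
import string

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def region_to_indicators(code: str) -> str:
    """Map a two-letter code such as "DE" to a regional indicator pair.

    Illustrative helper: accepts any two ASCII letters and does not
    verify that the code is actually assigned in ISO 3166-1.
    """
    if len(code) != 2 or any(c not in string.ascii_letters for c in code):
        raise ValueError("expected two ASCII letters, for example 'DE'")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code.upper()
    )


pair = region_to_indicators("DE")
print([f"U+{ord(c):04X}" for c in pair])  # ['U+1F1E9', 'U+1F1EA']
```

Whether the resulting pair is drawn as a flag, or as two separate letter symbols, is up to the fonts and software on the device.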