Emoji and Dingbats
Q: What are emoji?
A: Emoji are “picture characters” most frequently associated with cellular telephone usage in Japan, but also used in other East Asian countries and in other contexts. Use of emoji is also growing outside East Asia; some emoji-enabling applications for smartphones are very popular. Emoji are often pictographs—images of things such as faces, weather, vehicles and buildings, food and drink, animals and plants—or icons that represent emotions, feelings, or activities. In cellular phone usage, many emoji characters are presented in color (sometimes as a multicolor image), and some are presented in animated form, usually as a repeating sequence of two to four images—for example, a pulsing red heart. [PE]
Q: Are emoji the same thing as emoticons?
A: Not exactly. Emoticons (from “emotion” plus “icon”) are specifically intended to depict facial expression or body posture as a way of conveying emotion or attitude in e-mail and text messages. They originated as ASCII character combinations such as :-) to indicate a smile—and by extension, a joke—and :-( to indicate a frown. In East Asia, a number of more elaborate sequences have been developed, such as (")(-_-)(") showing an upset face with hands raised. Over time, many systems began replacing such sequences with images, and also began providing ways to input emoticon images directly, such as a menu or palette. The emoji sets used by Japanese cell phone carriers contain a large number of characters for emoticon images. [PE]
Q: How have emoji been encoded on cell phones?
A: Cell phone carriers in Japan have long encoded some emoji in Shift-JIS and ISO-2022 as extensions of the JIS X 0208 character set. A core set of 722 emoji constitutes the union of the emoji sets encoded in this way by the three most popular cell phone carriers in Japan. These core emoji characters are interchanged as plain text by millions of people daily (in SMS text messages and e-mail subject lines, for example), and need to be handled by e-mail systems, search engines, publishing systems, databases, and so on. For emoji beyond this core set (including those that are still being created), vendors have added rich text support, and use approaches such as embedded graphics. Similar techniques (embedded graphics or escape tags designating emoji) are also typically used for emoji support in China and the Republic of Korea. [PE]
Q: How are emoji encoded in Unicode?
A: 114 characters in the core emoji set are mapped to sequences of one or more characters available in Unicode before Version 6.0. The other 608 characters in the core emoji set are mapped to sequences of one or more characters added in Unicode 6.0, primarily in the blocks for Miscellaneous Symbols and Pictographs, Emoticons, and Transport and Map Symbols, but also in blocks such as Dingbats and Miscellaneous Technical. There is no block set aside specifically for emoji.
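As an illustration of how emoji are spread across several blocks, the following minimal sketch looks up the block of a few sample code points. The table is deliberately partial, the ranges are the ones published in Blocks.txt, and the helper name block_of is hypothetical, not part of any Unicode-provided API.

```python
# Partial block table; ranges taken from the Unicode Character Database (Blocks.txt).
BLOCKS = [
    (0x2300, 0x23FF, "Miscellaneous Technical"),
    (0x2600, 0x26FF, "Miscellaneous Symbols"),
    (0x2700, 0x27BF, "Dingbats"),
    (0x1F300, 0x1F5FF, "Miscellaneous Symbols and Pictographs"),
    (0x1F600, 0x1F64F, "Emoticons"),
    (0x1F680, 0x1F6FF, "Transport and Map Symbols"),
]


def block_of(ch: str) -> str:
    """Return the block name for a single character, using the partial table above."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "(not in this partial table)"


if __name__ == "__main__":
    # Sample code points chosen for illustration only.
    for ch in ("\U0001F3B5", "\U0001F601", "\U0001F680", "\u2702", "\u231A"):
        print(f"U+{ord(ch):04X} -> {block_of(ch)}")
```

A complete implementation would read Blocks.txt rather than hard-code ranges; the sketch only shows that no single range identifies a character as emoji.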
Characters that are separate in the extended JIS X 0208 sets used by the three major cell phone carriers in Japan are mapped to separate characters in Unicode in what is known as the Emoji Source Separation Rule. For example, the emoji core set includes a character mapped to U+1F3B5 MUSICAL NOTE; this could not be unified with U+266A EIGHTH NOTE, because both exist as separate characters in the extended JIS sets used by all three of the major cell phone carriers in Japan.
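The separation can be checked with any Unicode-aware tool. Here is a minimal sketch using Python's standard unicodedata module; the helper name describe is hypothetical, and it assumes a Python build whose character database includes Unicode 6.0 or later.

```python
import unicodedata

EMOJI_MUSICAL_NOTE = "\U0001F3B5"  # U+1F3B5 MUSICAL NOTE
EIGHTH_NOTE = "\u266A"             # U+266A EIGHTH NOTE


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the character database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    print(describe(EMOJI_MUSICAL_NOTE))
    print(describe(EIGHTH_NOTE))
    # The two remain distinct even under compatibility normalization.
    print(unicodedata.normalize("NFKC", EMOJI_MUSICAL_NOTE) == EIGHTH_NOTE)
```

The last line prints False: normalization does not fold one character into the other, so round-trip conversion to and from the carrier sets keeps them apart.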
Because characters in the core emoji set are treated as pictographs, they are encoded in Unicode based primarily on their general appearance, not on an intended semantic. In fact, when used as emoji, many of these characters acquire multiple meanings based on their appearance; for example, an emoji character for “bank” which includes the letters “BK” has taken on the secondary meaning “bakkureru” (a slang term for evading one's responsibilities). The identity of characters in the emoji core set is defined primarily by their mapping to Unicode, as specified in the file EmojiSources.txt. [PE]
Q: How should emoji be displayed?
A: While emoji symbols may be presented using color and animation, they need not be. Because many characters in the core emoji sets are unified with Unicode characters that originally came from other sources, there is no way based on character code alone to tell whether a character should be presented using an “emoji” style; that decision depends on context. [PE]
Q: What about characters whose name specifies a color?
A: Some of the characters from the core emoji sets have names that include a color term, for example, BLUE HEART or ORANGE BOOK. These color terms in the names do not imply any requirement about how a character must be presented; they are intended only to help identify the corresponding character in the core emoji sets. Even names of symbols such as BLACK MEDIUM SQUARE or WHITE MEDIUM SQUARE are not meant to indicate that the corresponding character must be presented in black or white, respectively; rather, the use of black and white is generally just to contrast filled versus outline shapes, or a darker color fill versus a lighter color fill. [PE]
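The point that a name only identifies a character can be illustrated by resolving names to code points. This is a minimal sketch using Python's standard unicodedata.lookup; the helper name code_point_for is hypothetical. Nothing in the result carries color information, so the rendering is left entirely to the font or platform.

```python
import unicodedata

# Names taken from the answer above; a color word is part of the identifier only.
NAMES = [
    "BLUE HEART",
    "ORANGE BOOK",
    "BLACK MEDIUM SQUARE",
    "WHITE MEDIUM SQUARE",
]


def code_point_for(name: str) -> str:
    """Resolve a character name to its 'U+XXXX' label."""
    return f"U+{ord(unicodedata.lookup(name)):04X}"


if __name__ == "__main__":
    for name in NAMES:
        print(f"{name}: {code_point_for(name)}")
```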
Q: What is the difference between emoji and dingbats?
A: Most of the characters in the Dingbats block are derived from a well-established set of glyphs, the ITC Zapf Dingbats series 100, which constitutes the industry standard “Zapf Dingbat” font currently available in most laser printers. Emoji and dingbats have some similarities (and a few core emoji characters are mapped to characters in the Dingbats block). However, while there is often a great deal of flexibility in the range of glyph shapes that may be used for presentation of emoji, most characters in the Dingbats block are expected to be presented with glyph shapes that closely align with those shown in the Unicode Standard. [PE]
Q: How do emoji relate to other Japanese symbol sets?
A: Other symbol sets defined in Japanese standards overlap extensively with the characters in the core emoji set. For example:
Many characters from the Japanese television standard ARIB STD-B24 2007 (from the Association of Radio Industries and Businesses) were added to Unicode in Version 5.2, and are mapped to characters in the core emoji set.
The Japanese recording industry standard RIS-506-1996 specifies an extension of Shift-JIS for use in Music CD text, and includes a number of characters similar to those in the core emoji set. [PE]
Q: What about Wingdings and Webdings? Are they encoded, and if not, why?
A: Many of the symbols in Microsoft's Webdings and Wingdings series fonts have already been encoded in Unicode. A proposal for encoding the remainder of these symbols has been approved by the Unicode Technical Committee and is currently working its way through the ISO balloting process. [PE]
Q: Does the Unicode Consortium design the emoji used on my phone and elsewhere?
A: No, the Unicode Consortium does not design emoji. The emoji encoded in the Unicode Standard were added to Unicode because they were in prior use as "smart phone" characters for text-messaging in a number of Japanese manufacturers' corporate standards, and other places.
Q: I'd like my favorite emoji added. Can the Unicode Consortium add it?
A: Probably not. We do not make or sell fonts, images, or icons, but you may want to consider using image files or icons instead. The suggested means of adding colorful new images and emoji to smart phone applications is for you to use such image files or icons.
Adding characters to an encoding standard involves a long, formal process. To be considered, characters must be in widespread use, as textual elements. The emoji and various symbols were added to Unicode because of their use as characters for text-messaging in a number of Japanese manufacturers' corporate standards, and other places. If you wish to submit emoji or any other character for consideration for encoding, see the detailed instructions about how to submit character encoding proposals. It may be helpful to see the Unicode Forum or the Unicode Mail List, as well.
Q: Why can't I find my national flag in my mobile application or on my smart phone?
A: For concerns about the emoji and flag symbols available in any particular application or mobile platform, please contact the manufacturer. Their software determines what characters are available on your device. If you want to use flag symbols on mobile devices, then images are recommended as a way to extend the set of flag symbols. Flag icons for all countries are easily available for free download from many sources. Simply search for "flag icons" on the web and you will find them.
Q: But the Unicode Standard includes other flags. Why don't you include my flag?
A: The Unicode Standard does not encode characters directly representing the symbols, icons, or images for any national flag. Those few "flag symbols" encoded as characters are either generic dingbat symbols for a flag, such as U+2690 WHITE FLAG, or symbols often associated with particular meanings in map contexts, such as U+26F3 FLAG IN HOLE for a golf course, or U+1F38C CROSSED FLAGS for a battlefield.
A set of regional indicator symbols is also encoded. These can be used in pairs to represent any territory that has an ISO 3166-1 two-letter code, such as "DE" for Germany. The pairs are typically displayed as national flags. However, the Unicode Standard itself does not prescribe which regional indicator pairs are represented with flags in fonts and input palettes on any given device. Please see Section 15.10, Enclosed and Square, in the Unicode Standard.
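The pairing described above can be sketched in a few lines of Python. The function name regional_indicator_pair is hypothetical. The sketch relies on the regional indicator symbols occupying a contiguous run starting at U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A, and it only checks that the input is two ASCII letters, not that the code is actually assigned in ISO 3166-1.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A


def regional_indicator_pair(code: str) -> str:
    """Map a two-letter code such as "DE" to its pair of regional indicator symbols."""
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected two ASCII letters, got {code!r}")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A"))
        for letter in code.upper()
    )


if __name__ == "__main__":
    pair = regional_indicator_pair("DE")
    print(" ".join(f"U+{ord(ch):04X}" for ch in pair))
```

For "DE" the result is the sequence U+1F1E9 U+1F1EA. Whether that sequence is drawn as a flag, or as two boxed letters, depends on the font and platform.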