
Supported Scripts

The Unicode Standard encodes scripts rather than languages per se. When writing systems for more than one language share sets of graphical symbols that have historically related derivations, the union of all of those graphical symbols is treated as a single collection of characters for encoding and is identified as a single script. Each script then serves as an inventory of graphical symbols, which are drawn upon for the writing systems of particular languages. In many cases, a single script, such as the Latin script, may be used to write tens or even hundreds of languages. In other cases, only one language employs a particular script—for example, Hangul, which is typically used only to write the Korean language. The writing systems for some languages may also use more than one script; for example, Japanese traditionally makes use of the Han (Kanji), Hiragana, and Katakana scripts, and modern Japanese usage commonly mixes in the Latin script as well.

The scripts supported by the Unicode Standard include all of those listed in the following table. The listing in the table is ordered by the version of the Unicode Standard in which a particular script was first encoded. In many instances, supplemental characters for a given script have been encoded in subsequent versions of the standard, after the initial addition of the script.

 

Version  Scripts Added
1.1      Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Tamil, Telugu, Thai
2.0      Tibetan
3.0      Braille (patterns), Canadian Syllabics, Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Ogham, Runic, Sinhala, Syriac, Thaana, Yi
3.1      Deseret, Gothic, Old Italic
3.2      Buhid, Hanunóo, Tagalog, Tagbanwa
4.0      Cypriot, Limbu, Linear B, Osmanya, Shavian, Tai Le, Ugaritic
4.1      Buginese, Coptic, Glagolitic, Kharoshthi, New Tai Lue, Old Persian Cuneiform, Syloti Nagri, Tifinagh
5.0      Balinese, N'Ko, Phags-pa, Phoenician, Sumero-Akkadian Cuneiform
5.1      Carian, Cham, Kayah Li, Lepcha, Lycian, Lydian, Ol Chiki, Rejang, Saurashtra, Sundanese, Vai
5.2      Avestan, Bamum, Egyptian Hieroglyphs, Imperial Aramaic, Inscriptional Pahlavi, Inscriptional Parthian, Javanese, Kaithi, Lisu, Meetei Mayek, Old South Arabian, Old Turkic, Samaritan, Tai Tham, Tai Viet
6.0      Batak, Brahmi, Mandaic
6.1      Chakma, Meroitic Cursive, Meroitic Hieroglyphs, Miao, Sharada, Sora Sompeng, Takri
7.0      Bassa Vah, Caucasian Albanian, Duployan (shorthand), Elbasan, Grantha, Khojki, Khudawadi, Linear A, Mahajani, Manichaean, Mende Kikakui, Modi, Mro, Nabataean, Old North Arabian, Old Permic, Pahawh Hmong, Palmyrene, Pau Cin Hau, Psalter Pahlavi, Siddham, Tirhuta, Warang Citi

 

In addition to the above scripts, a number of other collections of symbols are also encoded by Unicode. These collections include the following:

  • Numbers
  • General Diacritics
  • General Punctuation
  • General Symbols
  • Mathematical Symbols (Western and Arabic)
  • Musical Symbols (Western, Byzantine, and Ancient Greek)
  • Technical Symbols
  • Emoji
  • Dingbats
  • Arrows, Blocks, Box Drawing Forms, and Geometric Shapes
  • Game Symbols
  • Miscellaneous Symbols
  • Presentation Forms
  • Kangxi Radicals