Supported Scripts

The Unicode Standard primarily encodes scripts rather than languages. That is, where more than one language shares a set of symbols with a historically related derivation, the symbol sets of those languages are unified into a single collection, which is identified as a single script. These collections of symbols (i.e., scripts) then serve as inventories from which the symbols used to write particular languages are drawn. In many cases, a single script serves to write tens or even hundreds of languages (e.g., the Latin script). In other cases, only one language employs a particular script (e.g., Hangul, which is used only for the Korean language). The writing system of a language may also make use of more than one script; for example, Japanese traditionally uses the Han (Kanji), Hiragana, and Katakana scripts, and modern Japanese usage commonly mixes in the Latin script as well.

The primary scripts currently supported by Unicode 5.2.0 are:

  • Arabic
  • Aramaic, Imperial
  • Armenian
  • Avestan
  • Balinese
  • Bamum
  • Bengali
  • Bopomofo
  • Buginese
  • Buhid
  • Canadian Syllabics
  • Carian
  • Cham
  • Cherokee
  • Coptic
  • Cypriot
  • Cyrillic
  • Deseret
  • Devanagari
  • Egyptian Hieroglyphs
  • Ethiopic
  • Georgian
  • Glagolitic
  • Gothic
  • Greek
  • Gujarati
  • Gurmukhi
  • Han
  • Hangul
  • Hanunóo
  • Hebrew
  • Hiragana
  • Javanese
  • Kaithi
  • Kannada
  • Katakana
  • Kayah Li
  • Kharoshthi
  • Khmer
  • Lao
  • Latin
  • Lepcha (Rong)
  • Limbu
  • Linear B
  • Lisu
  • Lycian
  • Lydian
  • Malayalam
  • Meetei Mayek
  • Mongolian
  • Myanmar
  • New Tai Lue
  • N'Ko
  • Ogham
  • Ol Chiki
  • Old Italic (Etruscan)
  • Old Persian Cuneiform
  • Old South Arabian
  • Old Turkic
  • Oriya
  • Osmanya
  • Pahlavi, Inscriptional
  • Parthian, Inscriptional
  • Phags-pa
  • Phoenician
  • Rejang
  • Runic
  • Samaritan
  • Saurashtra
  • Shavian
  • Sinhala
  • Sumero-Akkadian Cuneiform
  • Sundanese
  • Syloti Nagri
  • Syriac
  • Tagalog
  • Tagbanwa
  • Tai Le
  • Tai Tham
  • Tai Viet
  • Tamil
  • Telugu
  • Thaana
  • Thai
  • Tibetan
  • Tifinagh (Berber)
  • Ugaritic
  • Vai
  • Yi

In addition to the scripts listed above, Unicode encodes a number of other collections of symbols. These collections consist of the following:

  • Numbers
  • General Diacritics
  • General Punctuation
  • General Symbols
  • Mathematical Symbols
  • Musical Symbols (Western, Byzantine, and Ancient Greek)
  • Technical Symbols
  • Dingbats
  • Arrows, Blocks, Box Drawing Forms, and Geometric Shapes
  • Game Symbols
  • Miscellaneous Symbols
  • Presentation Forms
  • Braille Patterns
  • Kangxi Radicals
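
Characters from these collections can be inspected in the same way as letters. The sketch below assumes Python's standard unicodedata module and a hypothetical helper named describe, and reports the code point, name, and general category of a few sample characters chosen here for illustration (an arrow, a mathematical operator, a card suit, a Braille pattern, and a Kangxi radical). The values come from the Unicode data bundled with the Python interpreter.

```python
import unicodedata


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# Sample characters picked for illustration; they are not taken from the lists above.
samples = ["\u2192", "\u2211", "\u2660", "\u2800", "\u2F00"]

for ch in samples:
    print(describe(ch))
```

General categories beginning with "S" mark symbols: "Sm" is used for mathematical symbols and "So" for other symbols.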