About Script Charts

The Script Charts list Unicode characters organized by script. The same character may appear in multiple charts if it has multiple Script_Extensions values or has a compatibility decomposition.

Hovering over a cell shows the name of the character in that cell. Characters added in the last release are highlighted in yellow.

To view these charts properly, you need an up-to-date browser and Unicode fonts (such as the Noto Fonts) that cover the characters you are interested in.

Details. The characters are listed in the charts according to their compatibility decompositions, each of which may be either a single character or a multi-character string.

  1. A single-character decomposition with a single explicit Script property value in its Script_Extensions set is listed in the chart for that script.
    • For example, the Greek chart lists:
        • U+03B1 (α) GREEK SMALL LETTER ALPHA, because it has Script_Extensions={Greek}
        • U+0342 ( ͂) COMBINING GREEK PERISPOMENI, because it has Script_Extensions={Greek}, even though it has Script=Inherited
  2. A single-character decomposition with multiple explicit Script values in its Script_Extensions set is listed in the chart for each of those Script values.
    • For example, the Armenian and the Georgian charts both list:
        • U+0589 (։) ARMENIAN FULL STOP, because it has Script_Extensions={Armenian, Georgian}
  3. A single-character decomposition with a single implicit Script property value in its Script_Extensions set is listed in one of the charts at the end, according to the character’s General_Category value.
    • For example, the Number-Decimal chart lists:
        • U+0030 (0) DIGIT ZERO, because it has Script_Extensions={Common} and no decomposition into characters with other Script values
  4. A multi-character decomposition is listed in the chart for each character in the decomposition, skipping characters with implicit Script values if any character has an explicit Script value.
    • For example, the Latin chart lists:
        • U+3393 (㎓) SQUARE GHZ, because it has a compatibility decomposition [G H z] into a sequence of Latin letters, even though it has Script_Extensions={Common}
        • U+33C7 (㏇) SQUARE CO, because it has a compatibility decomposition [C o .] into a sequence containing Latin letters and a period; the period is skipped because it has Script_Extensions={Common}
    • The Punctuation-Other chart lists:
        • U+2026 (…) HORIZONTAL ELLIPSIS, because the characters in its compatibility decomposition [. . .] have no explicit script values, only Script_Extensions={Common}
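
The four rules above can be sketched in a few lines of Python. This is only an illustration, not the code that generates the charts. It makes several assumptions: the Script_Extensions values are copied by hand from the examples above into a hypothetical table (Python's standard library does not expose Script or Script_Extensions), any character missing from that table is treated as Common, NFKD normalization stands in for the compatibility decomposition, and only Common and Inherited are treated as implicit values. The names SCRIPT_EXTENSIONS, IMPLICIT, GC_CHART_NAMES, and charts_for are invented for this sketch.

```python
import unicodedata

# Hypothetical, hand-copied Script_Extensions data for the examples above.
# Assumption: any character not listed here is treated as Common.
SCRIPT_EXTENSIONS = {
    "\u03B1": {"Greek"},                 # GREEK SMALL LETTER ALPHA
    "\u0342": {"Greek"},                 # COMBINING GREEK PERISPOMENI
    "\u0589": {"Armenian", "Georgian"},  # ARMENIAN FULL STOP
    **{c: {"Latin"} for c in "GHzCo"},   # letters used in the examples
}

# Assumption: these are the implicit Script values.
IMPLICIT = {"Common", "Inherited"}

# Illustrative chart names for two General_Category values;
# other categories fall back to their two-letter code.
GC_CHART_NAMES = {"Nd": "Number-Decimal", "Po": "Punctuation-Other"}


def charts_for(ch):
    """Return the set of chart names a character would be listed in."""
    # Assumption: NFKD approximates the compatibility decomposition.
    decomposition = unicodedata.normalize("NFKD", ch)
    explicit = set()      # explicit scripts found in the decomposition
    by_category = set()   # General_Category charts, used as a fallback
    for c in decomposition:
        scx = SCRIPT_EXTENSIONS.get(c, {"Common"})
        explicit |= scx - IMPLICIT
        gc = unicodedata.category(c)
        by_category.add(GC_CHART_NAMES.get(gc, gc))
    # Rules 1, 2, 4: explicit scripts win and implicit values are skipped.
    # Rule 3 (and the all-implicit case of rule 4): use General_Category.
    return explicit or by_category


print(sorted(charts_for("\u0589")))  # ['Armenian', 'Georgian']
print(sorted(charts_for("\u33C7")))  # ['Latin']
print(sorted(charts_for("\u2026")))  # ['Punctuation-Other']
```

With real data, the hand-written table would be replaced by values read from the Unicode Character Database.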

For more information about the Script and Script_Extensions properties, and explicit versus implicit Script values, see UAX #24: Unicode Script Property.