Unicode Utilities: Character Property Index


Category | Datatype | Source | Property | Values
Bidirectional | Binary | UCD | Bidi_Control | No (N), Yes (Y)
| | | Bidi_Mirrored | No (N), Yes (Y)
| Enumerated | | Bidi_Class | Show Values
| | | Bidi_Paired_Bracket_Type | Close (C), None (N), Open (O)
| String | | Bidi_Mirroring_Glyph | Show Values
| | | Bidi_Paired_Bracket | Show Values
Case | Binary | ICU | Case_Sensitive | No (N), Yes (Y)
| | UCD | Case_Ignorable | No (N), Yes (Y)
| | | Cased | No (N), Yes (Y)
| | | Changes_When_Casefolded | No (N), Yes (Y)
| | | Changes_When_Casemapped | No (N), Yes (Y)
| | | Changes_When_Lowercased | No (N), Yes (Y)
| | | Changes_When_Titlecased | No (N), Yes (Y)
| | | Changes_When_Uppercased | No (N), Yes (Y)
| | | Lowercase | No (N), Yes (Y)
| | | Soft_Dotted | No (N), Yes (Y)
| | | Uppercase | No (N), Yes (Y)
| | Unicode | isCased | No (N), Yes (Y)
| | | isCasefolded | No (N), Yes (Y)
| | | isLowercase | No (N), Yes (Y)
| | | isTitlecase | No (N), Yes (Y)
| | | isUppercase | No (N), Yes (Y)
| String | UCD | Case_Folding | Show Values
| | | Lowercase_Mapping | Show Values
| | | Simple_Case_Folding | Show Values
| | | Simple_Lowercase_Mapping | Show Values
| | | Simple_Titlecase_Mapping | Show Values
| | | Simple_Uppercase_Mapping | Show Values
| | | Titlecase_Mapping | Show Values
| | | Uppercase_Mapping | Show Values
| | Unicode | toCasefold | Show Values
| | | toLowercase | Show Values
| | | toTitlecase | Show Values
| | | toUppercase | Show Values
CJK | Binary | UCD | IDS_Binary_Operator | No (N), Yes (Y)
| | | IDS_Trinary_Operator | No (N), Yes (Y)
| | | Ideographic | No (N), Yes (Y)
| | | Radical | No (N), Yes (Y)
| | | Unified_Ideograph | No (N), Yes (Y)
| Enumerated | X-Demo | HanType | Han, Hans, Hant, na
| String | UCD | CJK_Radical | Show Values
| | | kSimplifiedVariant | Show Values
| | | kTraditionalVariant | Show Values
Emoji | Binary | UTS | Emoji | No (N), Yes (Y)
| | | Emoji_Component | No (N), Yes (Y)
| | | Emoji_Flag_Sequence | No (No), Yes (Yes)
| | | Emoji_Keycap_Sequence | No (No), Yes (Yes)
| | | Emoji_Modifier | No (N), Yes (Y)
| | | Emoji_Modifier_Base | No (N), Yes (Y)
| | | Emoji_Modifier_Sequence | No (No), Yes (Yes)
| | | Emoji_Presentation | No (N), Yes (Y)
| | | Emoji_Tag_Sequence | No (No), Yes (Yes)
| | | Emoji_Zwj_Sequence | No (No), Yes (Yes)
| Enumerated | UCD | Regional_Indicator | No (N), Yes (Y)
General | Binary | UCD | Alphabetic | No (N), Yes (Y)
| | | Default_Ignorable_Code_Point | No (N), Yes (Y)
| | | Deprecated | No (N), Yes (Y)
| | | Logical_Order_Exception | No (N), Yes (Y)
| | | Noncharacter_Code_Point | No (N), Yes (Y)
| | | Variation_Selector | No (N), Yes (Y)
| | | White_Space | No (N), Yes (Y)
| Catalog | | Age | Show Values
| | | Block | Show Values
| | | Script | Show Values
| Enumerated | | General_Category | Show Values
| | | Hangul_Syllable_Type | Leading_Jamo (L), LV_Syllable (LV), LVT_Syllable (LVT), Not_Applicable (NA), Trailing_Jamo (T), Vowel_Jamo (V)
| | | Name_Alias | Show Values
| | | Named_Sequences | Show Values
| | | Named_Sequences_Prov |
| String | | Nameslistsubhead | Show Values
| | UCD | Name | Show Values
| | | Script_Extensions | Adlam (Adlam), Adlam,Arabic,Hanifi_Rohingya,Mandaic,Manichaean,Psalter_Pahlavi,Sogdian,Syriac (Adlam,Arabic,Hanifi_Rohingya,Mandaic,Manichaean,Psalter_Pahlavi,Sogdian,Syriac), Ahom (Ahom), Anatolian_Hieroglyphs (Anatolian_Hieroglyphs), Arabic (Arabic), Arabic,Coptic (Arabic,Coptic), Arabic,Hanifi_Rohingya (Arabic,Hanifi_Rohingya), Arabic,Hanifi_Rohingya,Syriac,Thaana (Arabic,Hanifi_Rohingya,Syriac,Thaana), Arabic,Syriac (Arabic,Syriac), Arabic,Syriac,Thaana (Arabic,Syriac,Thaana), Arabic,Thaana (Arabic,Thaana), Armenian (Armenian), Armenian,Georgian (Armenian,Georgian), Avestan (Avestan),
Balinese (Balinese), Bamum (Bamum), Bassa_Vah (Bassa_Vah), Batak (Batak), Bengali (Bengali), Bengali,Chakma,Syloti_Nagri (Bengali,Chakma,Syloti_Nagri), Bengali,Devanagari (Bengali,Devanagari), Bengali,Devanagari,Dogra,Grantha,Gujarati,Gunjala_Gondi,Gurmukhi,Kannada,Khudawadi,Limbu,Mahajani,Malayalam,Masaram_Gondi,Nandinagari,Oriya,Sinhala,Syloti_Nagri,Takri,Tamil,Telugu,Tirhuta (Bengali,Devanagari,Dogra,Grantha,Gujarati,Gunjala_Gondi,Gurmukhi,Kannada,Khudawadi,Limbu,Mahajani,Malayalam,Masaram_Gondi,Nandinagari,Oriya,Sinhala,Syloti_Nagri,Takri,Tamil,Telugu,Tirhuta), Bengali,Devanagari,Dogra,Grantha,Gujarati,Gunjala_Gondi,Gurmukhi,Kannada,Khudawadi,Mahajani,Malayalam,Masaram_Gondi,Nandinagari,Oriya,Sinhala,Syloti_Nagri,Takri,Tamil,Telugu,Tirhuta (Bengali,Devanagari,Dogra,Grantha,Gujarati,Gunjala_Gondi,Gurmukhi,Kannada,Khudawadi,Mahajani,Malayalam,Masaram_Gondi,Nandinagari,Oriya,Sinhala,Syloti_Nagri,Takri,Tamil,Telugu,Tirhuta), Bengali,Devanagari,Grantha,Gujarati,Gurmukhi,Kannada,Latin,Malayalam,Oriya,Sharada,Tamil,Telugu,Tirhuta (Bengali,Devanagari,Grantha,Gujarati,Gurmukhi,Kannada,Latin,Malayalam,Oriya,Sharada,Tamil,Telugu,Tirhuta), Bengali,Devanagari,Grantha,Gujarati,Gurmukhi,Kannada,Latin,Malayalam,Oriya,Tamil,Telugu,Tirhuta (Bengali,Devanagari,Grantha,Gujarati,Gurmukhi,Kannada,Latin,Malayalam,Oriya,Tamil,Telugu,Tirhuta), Bengali,Devanagari,Grantha,Kannada (Bengali,Devanagari,Grantha,Kannada), Bengali,Devanagari,Grantha,Kannada,Nandinagari,Oriya,Telugu,Tirhuta (Bengali,Devanagari,Grantha,Kannada,Nandinagari,Oriya,Telugu,Tirhuta), Bhaiksuki (Bhaiksuki), Bopomofo (Bopomofo), Bopomofo,Han (Bopomofo,Han), Bopomofo,Han,Hangul,Hiragana,Katakana (Bopomofo,Han,Hangul,Hiragana,Katakana), Bopomofo,Han,Hangul,Hiragana,Katakana,Yi (Bopomofo,Han,Hangul,Hiragana,Katakana,Yi), Brahmi (Brahmi), Braille (Braille), Buginese (Buginese), Buginese,Javanese (Buginese,Javanese), Buhid (Buhid), Buhid,Hanunoo,Tagalog,Tagbanwa (Buhid,Hanunoo,Tagalog,Tagbanwa),
Canadian_Aboriginal (Canadian_Aboriginal), Carian (Carian), Caucasian_Albanian (Caucasian_Albanian), Chakma (Chakma), Chakma,Myanmar,Tai_Le (Chakma,Myanmar,Tai_Le), Cham (Cham), Cherokee (Cherokee), Common (Common), Coptic (Coptic), Cuneiform (Cuneiform), Cypriot (Cypriot), Cypriot,Linear_A,Linear_B (Cypriot,Linear_A,Linear_B), Cypriot,Linear_B (Cypriot,Linear_B), Cyrillic (Cyrillic), Cyrillic,Glagolitic (Cyrillic,Glagolitic), Cyrillic,Latin (Cyrillic,Latin), Cyrillic,Old_Permic (Cyrillic,Old_Permic),
Deseret (Deseret), Devanagari (Devanagari), Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Kannada,Khojki,Khudawadi,Mahajani,Malayalam,Modi,Nandinagari,Takri,Tirhuta (Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Kannada,Khojki,Khudawadi,Mahajani,Malayalam,Modi,Nandinagari,Takri,Tirhuta), Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Kannada,Khojki,Khudawadi,Mahajani,Modi,Nandinagari,Takri,Tirhuta (Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Kannada,Khojki,Khudawadi,Mahajani,Modi,Nandinagari,Takri,Tirhuta), Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Khojki,Khudawadi,Mahajani,Modi,Takri,Tirhuta (Devanagari,Dogra,Gujarati,Gurmukhi,Kaithi,Khojki,Khudawadi,Mahajani,Modi,Takri,Tirhuta), Devanagari,Dogra,Kaithi,Mahajani (Devanagari,Dogra,Kaithi,Mahajani), Devanagari,Grantha (Devanagari,Grantha), Devanagari,Grantha,Kannada (Devanagari,Grantha,Kannada), Devanagari,Grantha,Latin (Devanagari,Grantha,Latin), Devanagari,Kannada,Malayalam,Oriya,Tamil,Telugu (Devanagari,Kannada,Malayalam,Oriya,Tamil,Telugu), Devanagari,Nandinagari (Devanagari,Nandinagari), Devanagari,Sharada (Devanagari,Sharada), Devanagari,Tamil (Devanagari,Tamil), Dogra (Dogra), Duployan (Duployan),
Egyptian_Hieroglyphs (Egyptian_Hieroglyphs), Elbasan (Elbasan), Elymaic (Elymaic), Ethiopic (Ethiopic),
Georgian (Georgian), Georgian,Latin (Georgian,Latin), Glagolitic (Glagolitic), Gothic (Gothic), Grantha (Grantha), Grantha,Tamil (Grantha,Tamil), Greek (Greek), Gujarati (Gujarati), Gujarati,Khojki (Gujarati,Khojki), Gunjala_Gondi (Gunjala_Gondi), Gurmukhi (Gurmukhi), Gurmukhi,Multani (Gurmukhi,Multani),
Han (Han), Han,Hiragana,Katakana (Han,Hiragana,Katakana), Hangul (Hangul), Hanifi_Rohingya (Hanifi_Rohingya), Hanunoo (Hanunoo), Hatran (Hatran), Hebrew (Hebrew), Hiragana (Hiragana), Hiragana,Katakana (Hiragana,Katakana),
Imperial_Aramaic (Imperial_Aramaic), Inherited (Inherited), Inscriptional_Pahlavi (Inscriptional_Pahlavi), Inscriptional_Parthian (Inscriptional_Parthian),
Javanese (Javanese),
Kaithi (Kaithi), Kannada (Kannada), Kannada,Nandinagari (Kannada,Nandinagari), Katakana (Katakana), Kayah_Li (Kayah_Li), Kayah_Li,Latin,Myanmar (Kayah_Li,Latin,Myanmar), Kharoshthi (Kharoshthi), Khmer (Khmer), Khojki (Khojki), Khudawadi (Khudawadi),
Lao (Lao), Latin (Latin), Latin,Mongolian (Latin,Mongolian), Lepcha (Lepcha), Limbu (Limbu), Linear_A (Linear_A), Linear_B (Linear_B), Lisu (Lisu), Lycian (Lycian), Lydian (Lydian),
Mahajani (Mahajani), Makasar (Makasar), Malayalam (Malayalam), Mandaic (Mandaic), Manichaean (Manichaean), Marchen (Marchen), Masaram_Gondi (Masaram_Gondi), Medefaidrin (Medefaidrin), Meetei_Mayek (Meetei_Mayek), Mende_Kikakui (Mende_Kikakui), Meroitic_Cursive (Meroitic_Cursive), Meroitic_Hieroglyphs (Meroitic_Hieroglyphs), Miao (Miao), Modi (Modi), Mongolian (Mongolian), Mongolian,Phags_Pa (Mongolian,Phags_Pa), Mro (Mro), Multani (Multani), Myanmar (Myanmar),
Nabataean (Nabataean), Nandinagari (Nandinagari), New_Tai_Lue (New_Tai_Lue), Newa (Newa), Nko (Nko), Nushu (Nushu), Nyiakeng_Puachue_Hmong (Nyiakeng_Puachue_Hmong),
Ogham (Ogham), Ol_Chiki (Ol_Chiki), Old_Hungarian (Old_Hungarian), Old_Italic (Old_Italic), Old_North_Arabian (Old_North_Arabian), Old_Permic (Old_Permic), Old_Persian (Old_Persian), Old_Sogdian (Old_Sogdian), Old_South_Arabian (Old_South_Arabian), Old_Turkic (Old_Turkic), Oriya (Oriya), Osage (Osage), Osmanya (Osmanya),
Pahawh_Hmong (Pahawh_Hmong), Palmyrene (Palmyrene), Pau_Cin_Hau (Pau_Cin_Hau), Phags_Pa (Phags_Pa), Phoenician (Phoenician), Psalter_Pahlavi (Psalter_Pahlavi),
Rejang (Rejang), Runic (Runic),
Samaritan (Samaritan), Saurashtra (Saurashtra), Sharada (Sharada), Shavian (Shavian), Siddham (Siddham), Sign_Writing (Sign_Writing), Sinhala (Sinhala), Sogdian (Sogdian), Sora_Sompeng (Sora_Sompeng), Soyombo (Soyombo), Sundanese (Sundanese), Syloti_Nagri (Syloti_Nagri), Syriac (Syriac),
Tagalog (Tagalog), Tagbanwa (Tagbanwa), Tai_Le (Tai_Le), Tai_Tham (Tai_Tham), Tai_Viet (Tai_Viet), Takri (Takri), Tamil (Tamil), Tangut (Tangut), Telugu (Telugu), Thaana (Thaana), Thai (Thai), Tibetan (Tibetan), Tifinagh (Tifinagh), Tirhuta (Tirhuta),
Ugaritic (Ugaritic), Unknown (Unknown),
Vai (Vai),
Wancho (Wancho), Warang_Citi (Warang_Citi),
Yi (Yi),
Zanabazar_Square (Zanabazar_Square)
Identifiers | Binary | UCD | ID_Continue | No (N), Yes (Y)
| | | ID_Start | No (N), Yes (Y)
| | | Pattern_Syntax | No (N), Yes (Y)
| | | Pattern_White_Space | No (N), Yes (Y)
| | | XID_Continue | No (N), Yes (Y)
| | | XID_Start | No (N), Yes (Y)
IDNA | Enumerated | UTS | Idn_2008 | na (na), NV8 (nv8), XV8 (xv8)
| | | Idn_Status | deviation (dv), disallowed (da), disallowed_STD3_mapped (ds3m), disallowed_STD3_valid (ds3v), ignored (i), mapped (m), valid (v)
| | | idna2003 | deviation, disallowed, ignored, mapped, valid
| | | idna2008 | CONTEXTJ, CONTEXTO, DISALLOWED, PVALID, UNASSIGNED
| | | idna2008c | deviation, disallowed, ignored, mapped, valid
| | | uts46 | deviation, disallowed, ignored, mapped, valid
| String | | Idn_Mapping | Show Values
| | | toIdna2003 | Show Values
| | | toUts46n | Show Values
| | | toUts46t | Show Values
Miscellaneous | Binary | UCD | Dash | No (N), Yes (Y)
| | | Diacritic | No (N), Yes (Y)
| | | Extender | No (N), Yes (Y)
| | | Grapheme_Base | No (N), Yes (Y)
| | | Grapheme_Extend | No (N), Yes (Y)
| | | Grapheme_Link | No (N), Yes (Y)
| | | Hyphen | No (N), Yes (Y)
| | | Math | No (N), Yes (Y)
| | | Quotation_Mark | No (N), Yes (Y)
| | | Sentence_Terminal | No (N), Yes (Y)
| | | Terminal_Punctuation | No (N), Yes (Y)
| Enumerated | | Indic_Positional_Category | Show Values
| | | Indic_Syllabic_Category | Show Values
| Miscellaneous | | ISO_Comment | Show Values
| | | Unicode_1_Name | Show Values
Normalization | Binary | ICU | NFC_Inert | No (N), Yes (Y)
| | | NFD_Inert | No (N), Yes (Y)
| | | NFKC_Inert | No (N), Yes (Y)
| | | NFKD_Inert | No (N), Yes (Y)
| | | isNFM | No, Yes
| | UCD | Changes_When_NFKC_Casefolded | No (N), Yes (Y)
| | | Full_Composition_Exclusion | No (N), Yes (Y)
| | Unicode | isNFC | No, Yes
| | | isNFD | No, Yes
| | | isNFKC | No, Yes
| | | isNFKD | No, Yes
| Enumerated | ICU | Lead_Canonical_Combining_Class | Show Values
| | | Trail_Canonical_Combining_Class | Show Values
| | UCD | Canonical_Combining_Class | Show Values
| | | Decomposition_Type | Show Values
| | | NFC_Quick_Check | Maybe (M), No (N), Yes (Y)
| | | NFD_Quick_Check | No (N), Yes (Y)
| | | NFKC_Quick_Check | Maybe (M), No (N), Yes (Y)
| | | NFKD_Quick_Check | No (N), Yes (Y)
| String | ICU | toNFM | Show Values
| | UCD | NFKC_Casefold | Show Values
| | Unicode | toNFC | Show Values
| | | toNFD | Show Values
| | | toNFKC | Show Values
| | | toNFKD | Show Values
Numeric | Binary | UCD | ASCII_Hex_Digit | No (N), Yes (Y)
| | | Hex_Digit | No (N), Yes (Y)
| Enumerated | | Numeric_Type | Decimal (De), Digit (Di), None (None), Numeric (Nu)
| | | kAccountingNumeric | Show Values
| | | kOtherNumeric | Show Values
| | | kPrimaryNumeric | Show Values
| Numeric | | Numeric_Value | Show Values
Regex | Binary | UTS | ANY | No, Yes
| | | ASCII | No, Yes
| | | alnum | No (N), Yes (Y)
| | | blank | No (N), Yes (Y)
| | | bmp | No, Yes
| | | graph | No (N), Yes (Y)
| | | print | No (N), Yes (Y)
| | | xdigit | No (N), Yes (Y)
Security | Enumerated | UTS | Confusable_MA | Show Values
| | | Identifier_Status | Allowed (a), Restricted (r)
| | | Identifier_Type | Show Values
Shaping and Rendering | Binary | ICU | Segment_Starter | No (N), Yes (Y)
| | UCD | Join_Control | No (N), Yes (Y)
| Enumerated | | East_Asian_Width | Ambiguous (A), Fullwidth (F), Halfwidth (H), Narrow (Na), Neutral (N), Wide (W)
| | | Grapheme_Cluster_Break | Show Values
| | | Joining_Group | Show Values
| | | Joining_Type | Dual_Joining (D), Join_Causing (C), Left_Joining (L), Non_Joining (U), Right_Joining (R), Transparent (T)
| | | Line_Break | Show Values
| | | Prepended_Concatenation_Mark | No (N), Yes (Y)
| | | Sentence_Break | Show Values
| | | Standardized_Variant | Show Values
| | | Vertical_Orientation | Rotated (R), Transformed_Rotated (Tr), Transformed_Upright (Tu), Upright (U)
| | | Word_Break | Show Values
UCA | Binary | UTS | uca | Show Values
| | | uca2 | Show Values
| | | uca2.5 | Show Values
| | | uca3 | Show Values
Z-Other | Other | Other | Basic_Emoji | Other
| | | Equivalent_Unified_Ideograph | Other
| | | Extended_Pictographic | Other

Key

The Categories are from UCD Table 8. Property Summary Table, with some extended categories: Emoji, IDNA, Regex, Security, and UCA.

The Datatypes are from UCD Table 5. Property Type Key.
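
As an illustration of how the datatypes differ in practice, the following is a minimal sketch that reads a few of the UCD properties listed above using Python's standard `unicodedata` module. It is not part of the Unicode Utilities themselves. The helper name `describe` is hypothetical, and Python ships its own Unicode data version (see `unicodedata.unidata_version`), which may differ from the Unicode 12.0 data behind this index.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few UCD properties for one character (hypothetical helper)."""
    return {
        # Binary: Python reports Bidi_Mirrored as 0 or 1, converted to bool here.
        "Bidi_Mirrored": bool(unicodedata.mirrored(ch)),
        # Enumerated: Python returns the short value aliases, e.g. "Ps" or "ON".
        "General_Category": unicodedata.category(ch),
        "Bidi_Class": unicodedata.bidirectional(ch),
        "East_Asian_Width": unicodedata.east_asian_width(ch),
        # Enumerated in the UCD, but exposed by Python as an integer class.
        "Canonical_Combining_Class": unicodedata.combining(ch),
        # String: the value is itself a string (empty if the character is unnamed).
        "Name": unicodedata.name(ch, ""),
        "toNFD": unicodedata.normalize("NFD", ch),
        # Numeric: a float, or None when the character has no numeric value.
        "Numeric_Value": unicodedata.numeric(ch, None),
    }


if __name__ == "__main__":
    for ch in ["(", "\u00e9", "\u00bd"]:
        print(f"U+{ord(ch):04X}", describe(ch))
```

Only a small subset of the properties in the table is available this way; ICU-specific and UTS-sourced properties (for example NFC_Inert or Idn_Status) are not exposed by `unicodedata`.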

The Sources are:


Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Suggested fonts that you can add for coverage: Noto Fonts site; Unicode Fonts for Ancient Scripts; Large, multi-script Unicode fonts. See also: Unicode Display Problems.

Version 3.9; ICU version: 63.1; Unicode version: 12.0;