

L2/09-029 R3



Script Edge Cases

Mark Davis, 2009-02-02 (R3)

Live doc: https://docs.google.com/Doc?id=dfqr8rd5_371fzqg88g8



In most cases, the assignment of scripts works quite well. There are, however, some edge cases that people may stumble over (as I did). I suggest that we document such cases in a new section of TR24.



In particular, the characters that are associated with multiple scripts may need to be grouped with each of the scripts in particular applications. (See, for example, the mockup at http://macchiato.com/picker/MyApplication.html.)
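As a rough illustration of the grouping described above, the sketch below inverts a "character to scripts" table into a "script to characters" table, so that a picker could show a shared character under each of its scripts. The table name and the function name are hypothetical; the three sample entries use the tentative @ values proposed later in this document.

```python
from collections import defaultdict

# Hypothetical sample data: code point -> tentative script list,
# taken from the @... groupings proposed in this document.
TENTATIVE_SCRIPTS = {
    0x02BC: ["Latin", "Cyrillic"],           # MODIFIER LETTER APOSTROPHE
    0x0589: ["Armenian", "Georgian"],        # ARMENIAN FULL STOP
    0x060C: ["Arabic", "Syriac", "Thaana"],  # ARABIC COMMA
}

def group_by_script(char_scripts):
    """Invert code point -> scripts into script -> sorted code points."""
    groups = defaultdict(list)
    for cp, scripts in char_scripts.items():
        for script in scripts:
            groups[script].append(cp)
    return {script: sorted(cps) for script, cps in groups.items()}

groups = group_by_script(TENTATIVE_SCRIPTS)
for script in sorted(groups):
    print(script, " ".join(f"U+{cp:04X}" for cp in groups[script]))
```

With this shape, a character tagged with several scripts simply appears once per script, which is the behavior the mockup seems to want.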



The following is a suggested list of characters for such a section, with guesses as to how to document them. (I give some suggested property changes, but even if we don't make any, I think it is important to document the situation in one place.) The characters are all listed here for discussion; in the actual text they could be represented more compactly by ranges.
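One possible way to produce the compact range form mentioned above is sketched here. The helper names are hypothetical, and the sample is just a few code points from the @Latin list (U+02BC is deliberately absent because it is listed separately under @Latin, Cyrillic).

```python
def to_ranges(code_points):
    """Collapse code points into inclusive (start, end) ranges."""
    ranges = []
    for cp in sorted(set(code_points)):
        if ranges and cp == ranges[-1][1] + 1:
            # Extends the previous range by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            ranges.append((cp, cp))
    return ranges

def format_ranges(ranges):
    """Render ranges as 'U+XXXX' or 'U+XXXX..U+YYYY', comma separated."""
    return ", ".join(
        f"U+{a:04X}" if a == b else f"U+{a:04X}..U+{b:04X}"
        for a, b in ranges
    )

sample = [0x02B9, 0x02BA, 0x02BB, 0x02BD, 0x02BE, 0x02BF, 0x02C0, 0x02C1]
compact = format_ranges(to_ranges(sample))
print(compact)  # U+02B9..U+02BB, U+02BD..U+02C1
```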



Note: work on text based on item from Ken - characters are used in writing systems that are used with these multiple scripts. Also, that usage can change.



The tentative values I have are marked with @..., just to make it easy to extract the information in tools.



Note: if a PDF is in the doc registry, an HTML version should be there also, so that the links work. In the notes below, "explicit script" means a script other than Common or Inherited.
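A minimal sketch of how a tool might extract the @... values, assuming the layout used in this document: a line starting with @ gives a comma-separated script list, and each following line starting with U+XXXX is a character in that group. The function names and the sample text are illustrative only. A character listed under more than one @ group keeps only the last group in this simple version.

```python
import re

SCRIPT_LINE = re.compile(r"^@(.+)$")
CHAR_LINE = re.compile(r"^U\+([0-9A-F]{4,6})\b")

def parse_script_tags(text):
    """Return {code point: [scripts]} for text tagged as in this document."""
    result = {}
    current = []
    for raw in text.splitlines():
        line = raw.strip()
        m = SCRIPT_LINE.match(line)
        if m:
            # New @ group: remember its script list for the lines that follow.
            current = [s.strip() for s in m.group(1).split(",")]
            continue
        m = CHAR_LINE.match(line)
        if m and current:
            result[int(m.group(1), 16)] = list(current)
    return result

def has_explicit_script(scripts):
    """True if any script is something other than Common or Inherited."""
    return any(s not in ("Common", "Inherited") for s in scripts)

SAMPLE = """\
@Latin, Cyrillic
Spacing Modifier Letters - Miscellaneous phonetic modifiers
U+02BC ( \u02bc ) MODIFIER LETTER APOSTROPHE
@Armenian, Georgian
Armenian - Punctuation
U+0589 ( \u0589 ) ARMENIAN FULL STOP
"""

tags = parse_script_tags(SAMPLE)
for cp, scripts in sorted(tags.items()):
    print(f"U+{cp:04X}", scripts, has_explicit_script(scripts))
```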





@Latin

(inc. Phonetic alphabets)

Am guessing that these are functionally Latin script (including phonetic alphabets like IPA, UPA). Are they used with other scripts? Cyrillic? Greek?



Basic Latin - ASCII punctuation and symbols

U+005E ( ^ ) CIRCUMFLEX ACCENT

U+0060 ( ` ) GRAVE ACCENT

Latin 1 Supplement - Latin-1 punctuation and symbols

U+00A8 ( ¨ ) DIAERESIS

U+00AF ( ¯ ) MACRON

U+00B4 ( ´ ) ACUTE ACCENT

U+00B8 ( ¸ ) CEDILLA

Spacing Modifier Letters - Miscellaneous phonetic modifiers

U+02B9 ( ʹ ) MODIFIER LETTER PRIME

U+02BA ( ʺ ) MODIFIER LETTER DOUBLE PRIME

U+02BB ( ʻ ) MODIFIER LETTER TURNED COMMA

U+02BD ( ʽ ) MODIFIER LETTER REVERSED COMMA

U+02BE ( ʾ ) MODIFIER LETTER RIGHT HALF RING

U+02BF ( ʿ ) MODIFIER LETTER LEFT HALF RING

U+02C0 ( ˀ ) MODIFIER LETTER GLOTTAL STOP

U+02C1 ( ˁ ) MODIFIER LETTER REVERSED GLOTTAL STOP

U+02C2 ( ˂ ) MODIFIER LETTER LEFT ARROWHEAD

U+02C3 ( ˃ ) MODIFIER LETTER RIGHT ARROWHEAD

U+02C4 ( ˄ ) MODIFIER LETTER UP ARROWHEAD

U+02C5 ( ˅ ) MODIFIER LETTER DOWN ARROWHEAD

U+02C6 ( ˆ ) MODIFIER LETTER CIRCUMFLEX ACCENT

U+02C7 ( ˇ ) CARON

U+02C8 ( ˈ ) MODIFIER LETTER VERTICAL LINE

U+02C9 ( ˉ ) MODIFIER LETTER MACRON

U+02CA ( ˊ ) MODIFIER LETTER ACUTE ACCENT

U+02CB ( ˋ ) MODIFIER LETTER GRAVE ACCENT

U+02CC ( ˌ ) MODIFIER LETTER LOW VERTICAL LINE

U+02CD ( ˍ ) MODIFIER LETTER LOW MACRON

U+02CE ( ˎ ) MODIFIER LETTER LOW GRAVE ACCENT

U+02CF ( ˏ ) MODIFIER LETTER LOW ACUTE ACCENT

U+02D0 ( ː ) MODIFIER LETTER TRIANGULAR COLON

U+02D1 ( ˑ ) MODIFIER LETTER HALF TRIANGULAR COLON

U+02D2 ( ˒ ) MODIFIER LETTER CENTRED RIGHT HALF RING

U+02D3 ( ˓ ) MODIFIER LETTER CENTRED LEFT HALF RING

U+02D4 ( ˔ ) MODIFIER LETTER UP TACK

U+02D5 ( ˕ ) MODIFIER LETTER DOWN TACK

U+02D6 ( ˖ ) MODIFIER LETTER PLUS SIGN

U+02D7 ( ˗ ) MODIFIER LETTER MINUS SIGN

Spacing Modifier Letters - Spacing clones of diacritics

U+02D8 ( ˘ ) BREVE

U+02D9 ( ˙ ) DOT ABOVE

U+02DA ( ˚ ) RING ABOVE

U+02DB ( ˛ ) OGONEK

U+02DC ( ˜ ) SMALL TILDE

U+02DD ( ˝ ) DOUBLE ACUTE ACCENT

Spacing Modifier Letters - Additions based on 1989 IPA

U+02DE ( ˞ ) MODIFIER LETTER RHOTIC HOOK

U+02DF ( ˟ ) MODIFIER LETTER CROSS ACCENT

Spacing Modifier Letters - Tone letters

U+02E5 ( ˥ ) MODIFIER LETTER EXTRA-HIGH TONE BAR

U+02E6 ( ˦ ) MODIFIER LETTER HIGH TONE BAR

U+02E7 ( ˧ ) MODIFIER LETTER MID TONE BAR

U+02E8 ( ˨ ) MODIFIER LETTER LOW TONE BAR

U+02E9 ( ˩ ) MODIFIER LETTER EXTRA-LOW TONE BAR

Spacing Modifier Letters - IPA modifiers

U+02EC ( ˬ ) MODIFIER LETTER VOICING

U+02ED ( ˭ ) MODIFIER LETTER UNASPIRATED

Spacing Modifier Letters - Other modifier letter

U+02EE ( ˮ ) MODIFIER LETTER DOUBLE APOSTROPHE



(The following set appears to be for use in Latin/IPA according to WG2 docs)

Modifier Tone Letters - Corner tone marks for Chinese

U+A700 ( ꜀ ) MODIFIER LETTER CHINESE TONE YIN PING

U+A701 ( ꜁ ) MODIFIER LETTER CHINESE TONE YANG PING

U+A702 ( ꜂ ) MODIFIER LETTER CHINESE TONE YIN SHANG

U+A703 ( ꜃ ) MODIFIER LETTER CHINESE TONE YANG SHANG

U+A704 ( ꜄ ) MODIFIER LETTER CHINESE TONE YIN QU

U+A705 ( ꜅ ) MODIFIER LETTER CHINESE TONE YANG QU

U+A706 ( ꜆ ) MODIFIER LETTER CHINESE TONE YIN RU

U+A707 ( ꜇ ) MODIFIER LETTER CHINESE TONE YANG RU

Modifier Tone Letters - Dotted tone letters

U+A708 ( ꜈ ) MODIFIER LETTER EXTRA-HIGH DOTTED TONE BAR

U+A709 ( ꜉ ) MODIFIER LETTER HIGH DOTTED TONE BAR

U+A70A ( ꜊ ) MODIFIER LETTER MID DOTTED TONE BAR

U+A70B ( ꜋ ) MODIFIER LETTER LOW DOTTED TONE BAR

U+A70C ( ꜌ ) MODIFIER LETTER EXTRA-LOW DOTTED TONE BAR

U+A70D ( ꜍ ) MODIFIER LETTER EXTRA-HIGH DOTTED LEFT-STEM TONE BAR

U+A70E ( ꜎ ) MODIFIER LETTER HIGH DOTTED LEFT-STEM TONE BAR

U+A70F ( ꜏ ) MODIFIER LETTER MID DOTTED LEFT-STEM TONE BAR

U+A710 ( ꜐ ) MODIFIER LETTER LOW DOTTED LEFT-STEM TONE BAR

U+A711 ( ꜑ ) MODIFIER LETTER EXTRA-LOW DOTTED LEFT-STEM TONE BAR

Modifier Tone Letters - Left-stem tone letters

U+A712 ( ꜒ ) MODIFIER LETTER EXTRA-HIGH LEFT-STEM TONE BAR

U+A713 ( ꜓ ) MODIFIER LETTER HIGH LEFT-STEM TONE BAR

U+A714 ( ꜔ ) MODIFIER LETTER MID LEFT-STEM TONE BAR

U+A715 ( ꜕ ) MODIFIER LETTER LOW LEFT-STEM TONE BAR

U+A716 ( ꜖ ) MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR

Modifier Tone Letters - Chinantec tone marks

U+A717 ( ꜗ ) MODIFIER LETTER DOT VERTICAL BAR

U+A718 ( ꜘ ) MODIFIER LETTER DOT SLASH

U+A719 ( ꜙ ) MODIFIER LETTER DOT HORIZONTAL BAR

U+A71A ( ꜚ ) MODIFIER LETTER LOWER RIGHT CORNER ANGLE

Modifier Tone Letters - Africanist tone letters

U+A71B ( ꜛ ) MODIFIER LETTER RAISED UP ARROW

U+A71C ( ꜜ ) MODIFIER LETTER RAISED DOWN ARROW

U+A71D ( ꜝ ) MODIFIER LETTER RAISED EXCLAMATION MARK

U+A71E ( ꜞ ) MODIFIER LETTER RAISED INVERTED EXCLAMATION MARK

U+A71F ( ꜟ ) MODIFIER LETTER LOW INVERTED EXCLAMATION MARK

Latin Extended D - Modifier letters

U+A788 ( ꞈ ) MODIFIER LETTER LOW CIRCUMFLEX ACCENT

U+A789 ( ꞉ ) MODIFIER LETTER COLON

U+A78A ( ꞊ ) MODIFIER LETTER SHORT EQUALS SIGN

Spacing Modifier Letters - UPA modifiers

U+02EF ( ˯ ) MODIFIER LETTER LOW DOWN ARROWHEAD

U+02F0 ( ˰ ) MODIFIER LETTER LOW UP ARROWHEAD

U+02F1 ( ˱ ) MODIFIER LETTER LOW LEFT ARROWHEAD

U+02F2 ( ˲ ) MODIFIER LETTER LOW RIGHT ARROWHEAD

U+02F3 ( ˳ ) MODIFIER LETTER LOW RING

U+02F4 ( ˴ ) MODIFIER LETTER MIDDLE GRAVE ACCENT

U+02F5 ( ˵ ) MODIFIER LETTER MIDDLE DOUBLE GRAVE ACCENT

U+02F6 ( ˶ ) MODIFIER LETTER MIDDLE DOUBLE ACUTE ACCENT

U+02F7 ( ˷ ) MODIFIER LETTER LOW TILDE

U+02F8 ( ˸ ) MODIFIER LETTER RAISED COLON

U+02F9 ( ˹ ) MODIFIER LETTER BEGIN HIGH TONE

U+02FA ( ˺ ) MODIFIER LETTER END HIGH TONE

U+02FB ( ˻ ) MODIFIER LETTER BEGIN LOW TONE

U+02FC ( ˼ ) MODIFIER LETTER END LOW TONE

U+02FD ( ˽ ) MODIFIER LETTER SHELF

U+02FE ( ˾ ) MODIFIER LETTER OPEN SHELF

U+02FF ( ˿ ) MODIFIER LETTER LOW LEFT ARROW

Latin Extended D - Additions for UPA

U+A720 ( ꜠ ) MODIFIER LETTER STRESS AND HIGH TONE

U+A721 ( ꜡ ) MODIFIER LETTER STRESS AND LOW TONE



Letterlike Symbols - Letterlike symbols

U+2102 ( ℂ ) DOUBLE-STRUCK CAPITAL C

U+210A ( ℊ ) SCRIPT SMALL G

U+210B ( ℋ ) SCRIPT CAPITAL H

U+210C ( ℌ ) BLACK-LETTER CAPITAL H

U+210D ( ℍ ) DOUBLE-STRUCK CAPITAL H

U+210E ( ℎ ) PLANCK CONSTANT

U+210F ( ℏ ) PLANCK CONSTANT OVER TWO PI

U+2110 ( ℐ ) SCRIPT CAPITAL I

U+2111 ( ℑ ) BLACK-LETTER CAPITAL I

U+2112 ( ℒ ) SCRIPT CAPITAL L

U+2113 ( ℓ ) SCRIPT SMALL L

U+2115 ( ℕ ) DOUBLE-STRUCK CAPITAL N

U+2119 ( ℙ ) DOUBLE-STRUCK CAPITAL P

U+211A ( ℚ ) DOUBLE-STRUCK CAPITAL Q

U+211B ( ℛ ) SCRIPT CAPITAL R

U+211C ( ℜ ) BLACK-LETTER CAPITAL R

U+211D ( ℝ ) DOUBLE-STRUCK CAPITAL R

U+2124 ( ℤ ) DOUBLE-STRUCK CAPITAL Z

U+2128 ( ℨ ) BLACK-LETTER CAPITAL Z

U+212C ( ℬ ) SCRIPT CAPITAL B

U+212D ( ℭ ) BLACK-LETTER CAPITAL C

U+212F ( ℯ ) SCRIPT SMALL E

U+2130 ( ℰ ) SCRIPT CAPITAL E

U+2131 ( ℱ ) SCRIPT CAPITAL F

U+2133 ( ℳ ) SCRIPT CAPITAL M

U+2134 ( ℴ ) SCRIPT SMALL O

Letterlike Symbols - Double-struck italic math symbols

U+2145 ( ⅅ ) DOUBLE-STRUCK ITALIC CAPITAL D

U+2146 ( ⅆ ) DOUBLE-STRUCK ITALIC SMALL D

U+2147 ( ⅇ ) DOUBLE-STRUCK ITALIC SMALL E

U+2148 ( ⅈ ) DOUBLE-STRUCK ITALIC SMALL I

U+2149 ( ⅉ ) DOUBLE-STRUCK ITALIC SMALL J

Enclosed Alphanumerics - Parenthesized Latin letters

U+249C ( ⒜ ) PARENTHESIZED LATIN SMALL LETTER A

U+249D ( ⒝ ) PARENTHESIZED LATIN SMALL LETTER B

U+249E ( ⒞ ) PARENTHESIZED LATIN SMALL LETTER C

U+249F ( ⒟ ) PARENTHESIZED LATIN SMALL LETTER D

U+24A0 ( ⒠ ) PARENTHESIZED LATIN SMALL LETTER E

U+24A1 ( ⒡ ) PARENTHESIZED LATIN SMALL LETTER F

U+24A2 ( ⒢ ) PARENTHESIZED LATIN SMALL LETTER G

U+24A3 ( ⒣ ) PARENTHESIZED LATIN SMALL LETTER H

U+24A4 ( ⒤ ) PARENTHESIZED LATIN SMALL LETTER I

U+24A5 ( ⒥ ) PARENTHESIZED LATIN SMALL LETTER J

U+24A6 ( ⒦ ) PARENTHESIZED LATIN SMALL LETTER K

U+24A7 ( ⒧ ) PARENTHESIZED LATIN SMALL LETTER L

U+24A8 ( ⒨ ) PARENTHESIZED LATIN SMALL LETTER M

U+24A9 ( ⒩ ) PARENTHESIZED LATIN SMALL LETTER N

U+24AA ( ⒪ ) PARENTHESIZED LATIN SMALL LETTER O

U+24AB ( ⒫ ) PARENTHESIZED LATIN SMALL LETTER P

U+24AC ( ⒬ ) PARENTHESIZED LATIN SMALL LETTER Q

U+24AD ( ⒭ ) PARENTHESIZED LATIN SMALL LETTER R

U+24AE ( ⒮ ) PARENTHESIZED LATIN SMALL LETTER S

U+24AF ( ⒯ ) PARENTHESIZED LATIN SMALL LETTER T

U+24B0 ( ⒰ ) PARENTHESIZED LATIN SMALL LETTER U

U+24B1 ( ⒱ ) PARENTHESIZED LATIN SMALL LETTER V

U+24B2 ( ⒲ ) PARENTHESIZED LATIN SMALL LETTER W

U+24B3 ( ⒳ ) PARENTHESIZED LATIN SMALL LETTER X

U+24B4 ( ⒴ ) PARENTHESIZED LATIN SMALL LETTER Y

U+24B5 ( ⒵ ) PARENTHESIZED LATIN SMALL LETTER Z

Enclosed Alphanumerics - Circled Latin letters

U+24B6 ( Ⓐ ) CIRCLED LATIN CAPITAL LETTER A

U+24B7 ( Ⓑ ) CIRCLED LATIN CAPITAL LETTER B

U+24B8 ( Ⓒ ) CIRCLED LATIN CAPITAL LETTER C

U+24B9 ( Ⓓ ) CIRCLED LATIN CAPITAL LETTER D

U+24BA ( Ⓔ ) CIRCLED LATIN CAPITAL LETTER E

U+24BB ( Ⓕ ) CIRCLED LATIN CAPITAL LETTER F

U+24BC ( Ⓖ ) CIRCLED LATIN CAPITAL LETTER G

U+24BD ( Ⓗ ) CIRCLED LATIN CAPITAL LETTER H

U+24BE ( Ⓘ ) CIRCLED LATIN CAPITAL LETTER I

U+24BF ( Ⓙ ) CIRCLED LATIN CAPITAL LETTER J

U+24C0 ( Ⓚ ) CIRCLED LATIN CAPITAL LETTER K

U+24C1 ( Ⓛ ) CIRCLED LATIN CAPITAL LETTER L

U+24C2 ( Ⓜ ) CIRCLED LATIN CAPITAL LETTER M

U+24C3 ( Ⓝ ) CIRCLED LATIN CAPITAL LETTER N

U+24C4 ( Ⓞ ) CIRCLED LATIN CAPITAL LETTER O

U+24C5 ( Ⓟ ) CIRCLED LATIN CAPITAL LETTER P

U+24C6 ( Ⓠ ) CIRCLED LATIN CAPITAL LETTER Q

U+24C7 ( Ⓡ ) CIRCLED LATIN CAPITAL LETTER R

U+24C8 ( Ⓢ ) CIRCLED LATIN CAPITAL LETTER S

U+24C9 ( Ⓣ ) CIRCLED LATIN CAPITAL LETTER T

U+24CA ( Ⓤ ) CIRCLED LATIN CAPITAL LETTER U

U+24CB ( Ⓥ ) CIRCLED LATIN CAPITAL LETTER V

U+24CC ( Ⓦ ) CIRCLED LATIN CAPITAL LETTER W

U+24CD ( Ⓧ ) CIRCLED LATIN CAPITAL LETTER X

U+24CE ( Ⓨ ) CIRCLED LATIN CAPITAL LETTER Y

U+24CF ( Ⓩ ) CIRCLED LATIN CAPITAL LETTER Z

U+24D0 ( ⓐ ) CIRCLED LATIN SMALL LETTER A

U+24D1 ( ⓑ ) CIRCLED LATIN SMALL LETTER B

U+24D2 ( ⓒ ) CIRCLED LATIN SMALL LETTER C

U+24D3 ( ⓓ ) CIRCLED LATIN SMALL LETTER D

U+24D4 ( ⓔ ) CIRCLED LATIN SMALL LETTER E

U+24D5 ( ⓕ ) CIRCLED LATIN SMALL LETTER F

U+24D6 ( ⓖ ) CIRCLED LATIN SMALL LETTER G

U+24D7 ( ⓗ ) CIRCLED LATIN SMALL LETTER H

U+24D8 ( ⓘ ) CIRCLED LATIN SMALL LETTER I

U+24D9 ( ⓙ ) CIRCLED LATIN SMALL LETTER J

U+24DA ( ⓚ ) CIRCLED LATIN SMALL LETTER K

U+24DB ( ⓛ ) CIRCLED LATIN SMALL LETTER L

U+24DC ( ⓜ ) CIRCLED LATIN SMALL LETTER M

U+24DD ( ⓝ ) CIRCLED LATIN SMALL LETTER N

U+24DE ( ⓞ ) CIRCLED LATIN SMALL LETTER O

U+24DF ( ⓟ ) CIRCLED LATIN SMALL LETTER P

U+24E0 ( ⓠ ) CIRCLED LATIN SMALL LETTER Q

U+24E1 ( ⓡ ) CIRCLED LATIN SMALL LETTER R

U+24E2 ( ⓢ ) CIRCLED LATIN SMALL LETTER S

U+24E3 ( ⓣ ) CIRCLED LATIN SMALL LETTER T

U+24E4 ( ⓤ ) CIRCLED LATIN SMALL LETTER U

U+24E5 ( ⓥ ) CIRCLED LATIN SMALL LETTER V

U+24E6 ( ⓦ ) CIRCLED LATIN SMALL LETTER W

U+24E7 ( ⓧ ) CIRCLED LATIN SMALL LETTER X

U+24E8 ( ⓨ ) CIRCLED LATIN SMALL LETTER Y

U+24E9 ( ⓩ ) CIRCLED LATIN SMALL LETTER Z

Enclosed CJK Letters And Months - Squared Latin abbreviation

U+3250 ( ㉐ ) PARTNERSHIP SIGN

Enclosed CJK Letters And Months - Squared Latin abbreviations

U+32CC ( ㋌ ) SQUARE HG

U+32CD ( ㋍ ) SQUARE ERG

U+32CE ( ㋎ ) SQUARE EV

U+32CF ( ㋏ ) LIMITED LIABILITY SIGN

CJK Compatibility - Squared Latin abbreviations

U+3371 ( ㍱ ) SQUARE HPA

U+3372 ( ㍲ ) SQUARE DA

U+3373 ( ㍳ ) SQUARE AU

U+3374 ( ㍴ ) SQUARE BAR

U+3375 ( ㍵ ) SQUARE OV

U+3376 ( ㍶ ) SQUARE PC

U+3377 ( ㍷ ) SQUARE DM

U+3378 ( ㍸ ) SQUARE DM SQUARED

U+3379 ( ㍹ ) SQUARE DM CUBED

U+337A ( ㍺ ) SQUARE IU

U+3380 ( ㎀ ) SQUARE PA AMPS

U+3381 ( ㎁ ) SQUARE NA

U+3382 ( ㎂ ) SQUARE MU A

U+3383 ( ㎃ ) SQUARE MA

U+3384 ( ㎄ ) SQUARE KA

U+3385 ( ㎅ ) SQUARE KB

U+3386 ( ㎆ ) SQUARE MB

U+3387 ( ㎇ ) SQUARE GB

U+3388 ( ㎈ ) SQUARE CAL

U+3389 ( ㎉ ) SQUARE KCAL

U+338A ( ㎊ ) SQUARE PF

U+338B ( ㎋ ) SQUARE NF

U+338C ( ㎌ ) SQUARE MU F

U+338D ( ㎍ ) SQUARE MU G

U+338E ( ㎎ ) SQUARE MG

U+338F ( ㎏ ) SQUARE KG

U+3390 ( ㎐ ) SQUARE HZ

U+3391 ( ㎑ ) SQUARE KHZ

U+3392 ( ㎒ ) SQUARE MHZ

U+3393 ( ㎓ ) SQUARE GHZ

U+3394 ( ㎔ ) SQUARE THZ

U+3395 ( ㎕ ) SQUARE MU L

U+3396 ( ㎖ ) SQUARE ML

U+3397 ( ㎗ ) SQUARE DL

U+3398 ( ㎘ ) SQUARE KL

U+3399 ( ㎙ ) SQUARE FM

U+339A ( ㎚ ) SQUARE NM

U+339B ( ㎛ ) SQUARE MU M

U+339C ( ㎜ ) SQUARE MM

U+339D ( ㎝ ) SQUARE CM

U+339E ( ㎞ ) SQUARE KM

U+339F ( ㎟ ) SQUARE MM SQUARED

U+33A0 ( ㎠ ) SQUARE CM SQUARED

U+33A1 ( ㎡ ) SQUARE M SQUARED

U+33A2 ( ㎢ ) SQUARE KM SQUARED

U+33A3 ( ㎣ ) SQUARE MM CUBED

U+33A4 ( ㎤ ) SQUARE CM CUBED

U+33A5 ( ㎥ ) SQUARE M CUBED

U+33A6 ( ㎦ ) SQUARE KM CUBED

U+33A7 ( ㎧ ) SQUARE M OVER S

U+33A8 ( ㎨ ) SQUARE M OVER S SQUARED

U+33A9 ( ㎩ ) SQUARE PA

U+33AA ( ㎪ ) SQUARE KPA

U+33AB ( ㎫ ) SQUARE MPA

U+33AC ( ㎬ ) SQUARE GPA

U+33AD ( ㎭ ) SQUARE RAD

U+33AE ( ㎮ ) SQUARE RAD OVER S

U+33AF ( ㎯ ) SQUARE RAD OVER S SQUARED

U+33B0 ( ㎰ ) SQUARE PS

U+33B1 ( ㎱ ) SQUARE NS

U+33B2 ( ㎲ ) SQUARE MU S

U+33B3 ( ㎳ ) SQUARE MS

U+33B4 ( ㎴ ) SQUARE PV

U+33B5 ( ㎵ ) SQUARE NV

U+33B6 ( ㎶ ) SQUARE MU V

U+33B7 ( ㎷ ) SQUARE MV

U+33B8 ( ㎸ ) SQUARE KV

U+33B9 ( ㎹ ) SQUARE MV MEGA

U+33BA ( ㎺ ) SQUARE PW

U+33BB ( ㎻ ) SQUARE NW

U+33BC ( ㎼ ) SQUARE MU W

U+33BD ( ㎽ ) SQUARE MW

U+33BE ( ㎾ ) SQUARE KW

U+33BF ( ㎿ ) SQUARE MW MEGA

U+33C0 ( ㏀ ) SQUARE K OHM

U+33C1 ( ㏁ ) SQUARE M OHM

U+33C2 ( ㏂ ) SQUARE AM

U+33C3 ( ㏃ ) SQUARE BQ

U+33C4 ( ㏄ ) SQUARE CC

U+33C5 ( ㏅ ) SQUARE CD

U+33C6 ( ㏆ ) SQUARE C OVER KG

U+33C7 ( ㏇ ) SQUARE CO

U+33C8 ( ㏈ ) SQUARE DB

U+33C9 ( ㏉ ) SQUARE GY

U+33CA ( ㏊ ) SQUARE HA

U+33CB ( ㏋ ) SQUARE HP

U+33CC ( ㏌ ) SQUARE IN

U+33CD ( ㏍ ) SQUARE KK

U+33CE ( ㏎ ) SQUARE KM CAPITAL

U+33CF ( ㏏ ) SQUARE KT

U+33D0 ( ㏐ ) SQUARE LM

U+33D1 ( ㏑ ) SQUARE LN

U+33D2 ( ㏒ ) SQUARE LOG

U+33D3 ( ㏓ ) SQUARE LX

U+33D4 ( ㏔ ) SQUARE MB SMALL

U+33D5 ( ㏕ ) SQUARE MIL

U+33D6 ( ㏖ ) SQUARE MOL

U+33D7 ( ㏗ ) SQUARE PH

U+33D8 ( ㏘ ) SQUARE PM

U+33D9 ( ㏙ ) SQUARE PPM

U+33DA ( ㏚ ) SQUARE PR

U+33DB ( ㏛ ) SQUARE SR

U+33DC ( ㏜ ) SQUARE SV

U+33DD ( ㏝ ) SQUARE WB

U+33DE ( ㏞ ) SQUARE V OVER M

U+33DF ( ㏟ ) SQUARE A OVER M

CJK Compatibility - Squared Latin abbreviation

U+33FF ( ㏿ ) SQUARE GAL



@Latin, Cyrillic

The following is also used in Cyrillic

Spacing Modifier Letters - Miscellaneous phonetic modifiers

U+02BC ( ʼ ) MODIFIER LETTER APOSTROPHE

@Latin

While the following have the form of Greek or Cyrillic letters, they are functionally Latin/Phonetic, which should be noted.

Phonetic Extensions - Greek letters

U+1D26 ( ᴦ ) GREEK LETTER SMALL CAPITAL GAMMA

U+1D27 ( ᴧ ) GREEK LETTER SMALL CAPITAL LAMDA

U+1D28 ( ᴨ ) GREEK LETTER SMALL CAPITAL PI

U+1D29 ( ᴩ ) GREEK LETTER SMALL CAPITAL RHO

U+1D2A ( ᴪ ) GREEK LETTER SMALL CAPITAL PSI

Phonetic Extensions - Cyrillic letter

U+1D2B ( ᴫ ) CYRILLIC LETTER SMALL CAPITAL EL

Phonetic Extensions - Greek superscript modifier letters

U+1D5D ( ᵝ ) MODIFIER LETTER SMALL BETA

U+1D5E ( ᵞ ) MODIFIER LETTER SMALL GREEK GAMMA

U+1D5F ( ᵟ ) MODIFIER LETTER SMALL DELTA

U+1D60 ( ᵠ ) MODIFIER LETTER SMALL GREEK PHI

U+1D61 ( ᵡ ) MODIFIER LETTER SMALL CHI

Phonetic Extensions - Greek subscript modifier letters

U+1D66 ( ᵦ ) GREEK SUBSCRIPT SMALL LETTER BETA

U+1D67 ( ᵧ ) GREEK SUBSCRIPT SMALL LETTER GAMMA

U+1D68 ( ᵨ ) GREEK SUBSCRIPT SMALL LETTER RHO

U+1D69 ( ᵩ ) GREEK SUBSCRIPT SMALL LETTER PHI

U+1D6A ( ᵪ ) GREEK SUBSCRIPT SMALL LETTER CHI

Phonetic Extensions - Caucasian linguistics

U+1D78 ( ᵸ ) MODIFIER LETTER CYRILLIC EN

Phonetic Extensions Supplement - Modifier letters

U+1DBF ( ᶿ ) MODIFIER LETTER SMALL THETA

@Greek



These appear to have no explicit script just because they map to general punctuation marks or modifier letters.

Greek And Coptic - Numeral signs

U+0374 ( ʹ ) GREEK NUMERAL SIGN

Greek And Coptic - Punctuation

U+037E ( ; ) GREEK QUESTION MARK

Greek And Coptic - Spacing accent marks

U+0385 ( ΅ ) GREEK DIALYTIKA TONOS

Greek And Coptic - Punctuation

U+0387 ( · ) GREEK ANO TELEIA
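The mappings mentioned above can be inspected with a short sketch, assuming that "map to" refers to canonical decomposition. It uses only Python's standard unicodedata module; the helper name is hypothetical.

```python
import unicodedata

GREEK_EDGE_CASES = [0x0374, 0x037E, 0x0385, 0x0387]

def canonical_decomposition(cp):
    """Return the full canonical decomposition (NFD) as a list of code points."""
    return [ord(ch) for ch in unicodedata.normalize("NFD", chr(cp))]

for cp in GREEK_EDGE_CASES:
    target = " ".join(f"U+{d:04X}" for d in canonical_decomposition(cp))
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> {target}")
```

Three of the four decompose to a single non-Greek character (U+02B9, U+003B, U+00B7), and U+0385 decomposes to U+00A8 plus U+0301, which is consistent with the guess that the missing explicit script follows from the mapping.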



In contrast, the following does have an explicit script, and is the only Sk (Modifier_Symbol) that does. It is also odd because it is Sk, while the corresponding U+0374 is a Modifier_Letter.



U+0375 ( ͵ ) GREEK LOWER NUMERAL SIGN
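The General_Category mismatch noted above can be checked with a small sketch using Python's standard unicodedata module. The helper name is hypothetical, and the values reported depend on the Unicode version bundled with the Python build.

```python
import unicodedata

def general_category(cp):
    """Return the two-letter General_Category abbreviation for a code point."""
    return unicodedata.category(chr(cp))

for cp in (0x0374, 0x0375):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: gc={general_category(cp)}")
```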



Latin 1 Supplement - Latin-1 punctuation and symbols

U+00B5 ( µ ) MICRO SIGN

@Armenian, Georgian

Armenian - Punctuation

U+0589 ( ։ ) ARMENIAN FULL STOP

@Arabic

Arabic - Subtending marks

U+0600 ( ؀ ) ARABIC NUMBER SIGN

U+0601 ( ؁ ) ARABIC SIGN SANAH

U+0602 ( ؂ ) ARABIC FOOTNOTE MARKER

U+0603 ( ؃ ) ARABIC SIGN SAFHA

Arabic Presentation Forms A - Symbol

U+FDFD ( ﷽ ) ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM

@Arabic, Thaana

Arabic - Arabic-Indic digits

(Note that the U+06Fx EXTENDED ARABIC-INDIC DIGIT x characters already have the specific script Arabic.)

U+0660 ( ٠ ) ARABIC-INDIC DIGIT ZERO

U+0661 ( ١ ) ARABIC-INDIC DIGIT ONE

U+0662 ( ٢ ) ARABIC-INDIC DIGIT TWO

U+0663 ( ٣ ) ARABIC-INDIC DIGIT THREE

U+0664 ( ٤ ) ARABIC-INDIC DIGIT FOUR

U+0665 ( ٥ ) ARABIC-INDIC DIGIT FIVE

U+0666 ( ٦ ) ARABIC-INDIC DIGIT SIX

U+0667 ( ٧ ) ARABIC-INDIC DIGIT SEVEN

U+0668 ( ٨ ) ARABIC-INDIC DIGIT EIGHT

U+0669 ( ٩ ) ARABIC-INDIC DIGIT NINE

@Arabic, Syriac, Thaana

Arabic - Punctuation

U+060C ( ، ) ARABIC COMMA

U+061B ( ‎؛‎ ) ARABIC SEMICOLON

U+061F ( ‎؟‎ ) ARABIC QUESTION MARK

@Common

Arabic - Koranic annotation signs

U+06DD ( ۝ ) ARABIC END OF AYAH

@Arabic, Syriac

Are there any others of these that are not used with Syriac?

Arabic - Based on ISO 8859-6

U+0640 ( ‎ـ‎ ) ARABIC TATWEEL

Arabic - Points from ISO 8859-6

U+064B ( ً ) ARABIC FATHATAN

U+064C ( ٌ ) ARABIC DAMMATAN

U+064D ( ٍ ) ARABIC KASRATAN

U+064E ( َ ) ARABIC FATHA

U+064F ( ُ ) ARABIC DAMMA

U+0650 ( ِ ) ARABIC KASRA

U+0651 ( ّ ) ARABIC SHADDA

U+0652 ( ْ ) ARABIC SUKUN

Arabic - Combining maddah and hamza

U+0653 ( ٓ ) ARABIC MADDAH ABOVE

U+0654 ( ٔ ) ARABIC HAMZA ABOVE

U+0655 ( ٕ ) ARABIC HAMZA BELOW

Arabic - Point

U+0670 ( ٰ ) ARABIC LETTER SUPERSCRIPT ALEF

@Bopomofo

These appear to be just Bopomofo script

Spacing Modifier Letters - Extended Bopomofo tone marks

U+02EA ( ˪ ) MODIFIER LETTER YIN DEPARTING TONE MARK

U+02EB ( ˫ ) MODIFIER LETTER YANG DEPARTING TONE MARK

@Devanagari

Am guessing these are Devanagari script.



Devanagari - Various signs

U+0951 ( ॑ ) DEVANAGARI STRESS SIGN UDATTA

U+0952 ( ॒ ) DEVANAGARI STRESS SIGN ANUDATTA

Devanagari - Devanagari-specific additions

U+0970 ( ॰ ) DEVANAGARI ABBREVIATION SIGN

@Devanagari, Bengali, Gurmukhi, Oriya

The annotations say "scripts of India". However, Cibu reports that the dandas are not used with Malayalam, Kannada, Telugu, Tamil and Gujarati, and presumably these are not used with Urdu, etc.

Devanagari - Generic punctuation for scripts of India

U+0964 ( । ) DEVANAGARI DANDA

U+0965 ( ॥ ) DEVANAGARI DOUBLE DANDA

@Devanagari, Bengali, Gurmukhi, Oriya, Gujarati, Tamil, Telugu, Kannada, Malayalam

The annotations say "The Vedic signs for jihvamuliya and upadhmaniya were encoded in the Kannada block, but are intended for general Vedic use with all scripts", which probably means "with all Brahmi-based Indic scripts".

Kannada - Vedic signs

U+0CF1 ( ೱ ) KANNADA SIGN JIHVAMULIYA

U+0CF2 ( ೲ ) KANNADA SIGN UPADHMANIYA



@Georgian

Note: historic, Latin, Cyrillic, Greek, Coptic

Georgian - Punctuation

U+10FB ( ჻ ) GEORGIAN PARAGRAPH SEPARATOR

@Runic

Any archaic script like Runic (Get current list from Ken)

Runic - Punctuation

U+16EB ( ᛫ ) RUNIC SINGLE PUNCTUATION

U+16EC ( ᛬ ) RUNIC MULTIPLE PUNCTUATION

U+16ED ( ᛭ ) RUNIC CROSS PUNCTUATION

@Hanunoo, Tagalog, Buhid, Tagbanwa

Don't know exactly what "Philippine scripts" is supposed to be; am guessing the above.

Hanunoo - Generic punctuation for Philippine scripts

U+1735 ( ᜵ ) PHILIPPINE SINGLE PUNCTUATION

U+1736 ( ᜶ ) PHILIPPINE DOUBLE PUNCTUATION

@Mongolian, Phags-Pa

Mongolian - Punctuation

U+1802 ( ᠂ ) MONGOLIAN COMMA

U+1803 ( ᠃ ) MONGOLIAN FULL STOP

U+1805 ( ᠅ ) MONGOLIAN FOUR DOTS

@Hiragana, Katakana

Think these are pretty clearly just Hiragana and Katakana.



CJK Symbols And Punctuation - Other CJK symbols

U+3031 ( 〱 ) VERTICAL KANA REPEAT MARK

U+3032 ( 〲 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK

U+3033 ( 〳 ) VERTICAL KANA REPEAT MARK UPPER HALF

U+3034 ( 〴 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF

U+3035 ( 〵 ) VERTICAL KANA REPEAT MARK LOWER HALF

Hiragana - Voicing marks

U+3099 ( ゙ ) COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK

U+309A ( ゚ ) COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK

U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK

U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK

Katakana - Katakana punctuation

U+30A0 ( ゠ ) KATAKANA-HIRAGANA DOUBLE HYPHEN

Katakana - Conjunction and length marks

U+30FB ( ・ ) KATAKANA MIDDLE DOT

U+30FC ( ー ) KATAKANA-HIRAGANA PROLONGED SOUND MARK

Halfwidth And Fullwidth Forms - Halfwidth Katakana variants

U+FF65 ( ･ ) HALFWIDTH KATAKANA MIDDLE DOT

U+FF70 ( ｰ ) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK

U+FF9E ( ﾞ ) HALFWIDTH KATAKANA VOICED SOUND MARK

U+FF9F ( ﾟ ) HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK



[:Block=Kanbun:]

// used with the Japanese Writing system - change to Jpan
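For reference, the set written above as [:Block=Kanbun:] can be enumerated with the sketch below. Python's standard library has no block property, so the block range U+3190..U+319F is hardcoded here as an assumption of this example; the constant and function names are hypothetical.

```python
import unicodedata

# Assumed range of the Kanbun block (hardcoded; not queried from a library).
KANBUN_BLOCK = range(0x3190, 0x31A0)

def named_characters(code_points):
    """Return (code point, name) pairs for the assigned characters in a range."""
    pairs = []
    for cp in code_points:
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            pairs.append((cp, name))
    return pairs

kanbun = named_characters(KANBUN_BLOCK)
for cp, name in kanbun:
    print(f"U+{cp:04X} {name}")
```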



@Hangul

CJK Symbols And Punctuation - Diacritics

U+302E ( 〮 ) HANGUL SINGLE DOT TONE MARK

U+302F ( 〯 ) HANGUL DOUBLE DOT TONE MARK



General comment: use the special codes for writing systems (Jpan, Kore, ...).

@Han

Later add Tangut, Jurchen, Khitan, once encoded.

[:Block=Ideographic_Description_Characters:]

CJK Strokes - CJK strokes

U+31C0 ( ㇀ ) CJK STROKE T

U+31C1 ( ㇁ ) CJK STROKE WG

U+31C2 ( ㇂ ) CJK STROKE XG

U+31C3 ( ㇃ ) CJK STROKE BXG

U+31C4 ( ㇄ ) CJK STROKE SW

U+31C5 ( ㇅ ) CJK STROKE HZZ

U+31C6 ( ㇆ ) CJK STROKE HZG

U+31C7 ( ㇇ ) CJK STROKE HP

U+31C8 ( ㇈ ) CJK STROKE HZWG

U+31C9 ( ㇉ ) CJK STROKE SZWG

U+31CA ( ㇊ ) CJK STROKE HZT

U+31CB ( ㇋ ) CJK STROKE HZZP

U+31CC ( ㇌ ) CJK STROKE HPWG

U+31CD ( ㇍ ) CJK STROKE HZW

U+31CE ( ㇎ ) CJK STROKE HZZZ

U+31CF ( ㇏ ) CJK STROKE N

U+31D0 ( ㇐ ) CJK STROKE H

U+31D1 ( ㇑ ) CJK STROKE S

U+31D2 ( ㇒ ) CJK STROKE P

U+31D3 ( ㇓ ) CJK STROKE SP

U+31D4 ( ㇔ ) CJK STROKE D

U+31D5 ( ㇕ ) CJK STROKE HZ

U+31D6 ( ㇖ ) CJK STROKE HG

U+31D7 ( ㇗ ) CJK STROKE SZ

U+31D8 ( ㇘ ) CJK STROKE SWZ

U+31D9 ( ㇙ ) CJK STROKE ST

U+31DA ( ㇚ ) CJK STROKE SG

U+31DB ( ㇛ ) CJK STROKE PD

U+31DC ( ㇜ ) CJK STROKE PZ

U+31DD ( ㇝ ) CJK STROKE TN

U+31DE ( ㇞ ) CJK STROKE SZZ

U+31DF ( ㇟ ) CJK STROKE SWG

U+31E0 ( ㇠ ) CJK STROKE HXWG

U+31E1 ( ㇡ ) CJK STROKE HZZZG

U+31E2 ( ㇢ ) CJK STROKE PG

U+31E3 ( ㇣ ) CJK STROKE Q

Enclosed CJK Letters And Months - Parenthesized ideographs

U+3220 ( ㈠ ) PARENTHESIZED IDEOGRAPH ONE

U+3221 ( ㈡ ) PARENTHESIZED IDEOGRAPH TWO

U+3222 ( ㈢ ) PARENTHESIZED IDEOGRAPH THREE

U+3223 ( ㈣ ) PARENTHESIZED IDEOGRAPH FOUR

U+3224 ( ㈤ ) PARENTHESIZED IDEOGRAPH FIVE

U+3225 ( ㈥ ) PARENTHESIZED IDEOGRAPH SIX

U+3226 ( ㈦ ) PARENTHESIZED IDEOGRAPH SEVEN

U+3227 ( ㈧ ) PARENTHESIZED IDEOGRAPH EIGHT

U+3228 ( ㈨ ) PARENTHESIZED IDEOGRAPH NINE

U+3229 ( ㈩ ) PARENTHESIZED IDEOGRAPH TEN

U+322A ( ㈪ ) PARENTHESIZED IDEOGRAPH MOON

U+322B ( ㈫ ) PARENTHESIZED IDEOGRAPH FIRE

U+322C ( ㈬ ) PARENTHESIZED IDEOGRAPH WATER

U+322D ( ㈭ ) PARENTHESIZED IDEOGRAPH WOOD

U+322E ( ㈮ ) PARENTHESIZED IDEOGRAPH METAL

U+322F ( ㈯ ) PARENTHESIZED IDEOGRAPH EARTH

U+3230 ( ㈰ ) PARENTHESIZED IDEOGRAPH SUN

U+3231 ( ㈱ ) PARENTHESIZED IDEOGRAPH STOCK

U+3232 ( ㈲ ) PARENTHESIZED IDEOGRAPH HAVE

U+3233 ( ㈳ ) PARENTHESIZED IDEOGRAPH SOCIETY

U+3234 ( ㈴ ) PARENTHESIZED IDEOGRAPH NAME

U+3235 ( ㈵ ) PARENTHESIZED IDEOGRAPH SPECIAL

U+3236 ( ㈶ ) PARENTHESIZED IDEOGRAPH FINANCIAL

U+3237 ( ㈷ ) PARENTHESIZED IDEOGRAPH CONGRATULATION

U+3238 ( ㈸ ) PARENTHESIZED IDEOGRAPH LABOR

U+3239 ( ㈹ ) PARENTHESIZED IDEOGRAPH REPRESENT

U+323A ( ㈺ ) PARENTHESIZED IDEOGRAPH CALL

U+323B ( ㈻ ) PARENTHESIZED IDEOGRAPH STUDY

U+323C ( ㈼ ) PARENTHESIZED IDEOGRAPH SUPERVISE

U+323D ( ㈽ ) PARENTHESIZED IDEOGRAPH ENTERPRISE

U+323E ( ㈾ ) PARENTHESIZED IDEOGRAPH RESOURCE

U+323F ( ㈿ ) PARENTHESIZED IDEOGRAPH ALLIANCE

U+3240 ( ㉀ ) PARENTHESIZED IDEOGRAPH FESTIVAL

U+3241 ( ㉁ ) PARENTHESIZED IDEOGRAPH REST

U+3242 ( ㉂ ) PARENTHESIZED IDEOGRAPH SELF

U+3243 ( ㉃ ) PARENTHESIZED IDEOGRAPH REACH

Enclosed CJK Letters And Months - Circled ideographs

U+3280 ( ㊀ ) CIRCLED IDEOGRAPH ONE

U+3281 ( ㊁ ) CIRCLED IDEOGRAPH TWO

U+3282 ( ㊂ ) CIRCLED IDEOGRAPH THREE

U+3283 ( ㊃ ) CIRCLED IDEOGRAPH FOUR

U+3284 ( ㊄ ) CIRCLED IDEOGRAPH FIVE

U+3285 ( ㊅ ) CIRCLED IDEOGRAPH SIX

U+3286 ( ㊆ ) CIRCLED IDEOGRAPH SEVEN

U+3287 ( ㊇ ) CIRCLED IDEOGRAPH EIGHT

U+3288 ( ㊈ ) CIRCLED IDEOGRAPH NINE

U+3289 ( ㊉ ) CIRCLED IDEOGRAPH TEN

U+328A ( ㊊ ) CIRCLED IDEOGRAPH MOON

U+328B ( ㊋ ) CIRCLED IDEOGRAPH FIRE

U+328C ( ㊌ ) CIRCLED IDEOGRAPH WATER

U+328D ( ㊍ ) CIRCLED IDEOGRAPH WOOD

U+328E ( ㊎ ) CIRCLED IDEOGRAPH METAL

U+328F ( ㊏ ) CIRCLED IDEOGRAPH EARTH

U+3290 ( ㊐ ) CIRCLED IDEOGRAPH SUN

U+3291 ( ㊑ ) CIRCLED IDEOGRAPH STOCK

U+3292 ( ㊒ ) CIRCLED IDEOGRAPH HAVE

U+3293 ( ㊓ ) CIRCLED IDEOGRAPH SOCIETY

U+3294 ( ㊔ ) CIRCLED IDEOGRAPH NAME

U+3295 ( ㊕ ) CIRCLED IDEOGRAPH SPECIAL

U+3296 ( ㊖ ) CIRCLED IDEOGRAPH FINANCIAL

U+3297 ( ㊗ ) CIRCLED IDEOGRAPH CONGRATULATION

U+3298 ( ㊘ ) CIRCLED IDEOGRAPH LABOR

U+3299 ( ㊙ ) CIRCLED IDEOGRAPH SECRET

U+329A ( ㊚ ) CIRCLED IDEOGRAPH MALE

U+329B ( ㊛ ) CIRCLED IDEOGRAPH FEMALE

U+329C ( ㊜ ) CIRCLED IDEOGRAPH SUITABLE

U+329D ( ㊝ ) CIRCLED IDEOGRAPH EXCELLENT

U+329E ( ㊞ ) CIRCLED IDEOGRAPH PRINT

U+329F ( ㊟ ) CIRCLED IDEOGRAPH ATTENTION

U+32A0 ( ㊠ ) CIRCLED IDEOGRAPH ITEM

U+32A1 ( ㊡ ) CIRCLED IDEOGRAPH REST

U+32A2 ( ㊢ ) CIRCLED IDEOGRAPH COPY

U+32A3 ( ㊣ ) CIRCLED IDEOGRAPH CORRECT

U+32A4 ( ㊤ ) CIRCLED IDEOGRAPH HIGH

U+32A5 ( ㊥ ) CIRCLED IDEOGRAPH CENTRE

U+32A6 ( ㊦ ) CIRCLED IDEOGRAPH LOW

U+32A7 ( ㊧ ) CIRCLED IDEOGRAPH LEFT

U+32A8 ( ㊨ ) CIRCLED IDEOGRAPH RIGHT

U+32A9 ( ㊩ ) CIRCLED IDEOGRAPH MEDICINE

U+32AA ( ㊪ ) CIRCLED IDEOGRAPH RELIGION

U+32AB ( ㊫ ) CIRCLED IDEOGRAPH STUDY

U+32AC ( ㊬ ) CIRCLED IDEOGRAPH SUPERVISE

U+32AD ( ㊭ ) CIRCLED IDEOGRAPH ENTERPRISE

U+32AE ( ㊮ ) CIRCLED IDEOGRAPH RESOURCE

U+32AF ( ㊯ ) CIRCLED IDEOGRAPH ALLIANCE

U+32B0 ( ㊰ ) CIRCLED IDEOGRAPH NIGHT

Enclosed CJK Letters And Months - Telegraph symbols for months

U+32C0 ( ㋀ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JANUARY

U+32C1 ( ㋁ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR FEBRUARY

U+32C2 ( ㋂ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR MARCH

U+32C3 ( ㋃ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR APRIL

U+32C4 ( ㋄ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR MAY

U+32C5 ( ㋅ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JUNE

U+32C6 ( ㋆ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR JULY

U+32C7 ( ㋇ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR AUGUST

U+32C8 ( ㋈ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR SEPTEMBER

U+32C9 ( ㋉ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR OCTOBER

U+32CA ( ㋊ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR NOVEMBER

U+32CB ( ㋋ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DECEMBER

CJK Compatibility - Telegraph symbols for hours

U+3358 ( ㍘ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ZERO

U+3359 ( ㍙ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ONE

U+335A ( ㍚ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWO

U+335B ( ㍛ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR THREE

U+335C ( ㍜ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FOUR

U+335D ( ㍝ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FIVE

U+335E ( ㍞ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SIX

U+335F ( ㍟ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SEVEN

U+3360 ( ㍠ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR EIGHT

U+3361 ( ㍡ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR NINE

U+3362 ( ㍢ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TEN

U+3363 ( ㍣ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR ELEVEN

U+3364 ( ㍤ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWELVE

U+3365 ( ㍥ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR THIRTEEN

U+3366 ( ㍦ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FOURTEEN

U+3367 ( ㍧ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FIFTEEN

U+3368 ( ㍨ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SIXTEEN

U+3369 ( ㍩ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SEVENTEEN

U+336A ( ㍪ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR EIGHTEEN

U+336B ( ㍫ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR NINETEEN

U+336C ( ㍬ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY

U+336D ( ㍭ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-ONE

U+336E ( ㍮ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-TWO

U+336F ( ㍯ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-THREE

U+3370 ( ㍰ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWENTY-FOUR

CJK Compatibility - Japanese era names

U+337B ( ㍻ ) SQUARE ERA NAME HEISEI

U+337C ( ㍼ ) SQUARE ERA NAME SYOUWA

U+337D ( ㍽ ) SQUARE ERA NAME TAISYOU

U+337E ( ㍾ ) SQUARE ERA NAME MEIZI

CJK Compatibility - Japanese corporation

U+337F ( ㍿ ) SQUARE CORPORATION

CJK Compatibility - Telegraph symbols for days

U+33E0 ( ㏠ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY ONE

U+33E1 ( ㏡ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWO

U+33E2 ( ㏢ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THREE

U+33E3 ( ㏣ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FOUR

U+33E4 ( ㏤ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FIVE

U+33E5 ( ㏥ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SIX

U+33E6 ( ㏦ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SEVEN

U+33E7 ( ㏧ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY EIGHT

U+33E8 ( ㏨ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY NINE

U+33E9 ( ㏩ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TEN

U+33EA ( ㏪ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY ELEVEN

U+33EB ( ㏫ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWELVE

U+33EC ( ㏬ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTEEN

U+33ED ( ㏭ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FOURTEEN

U+33EE ( ㏮ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY FIFTEEN

U+33EF ( ㏯ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SIXTEEN

U+33F0 ( ㏰ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY SEVENTEEN

U+33F1 ( ㏱ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY EIGHTEEN

U+33F2 ( ㏲ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY NINETEEN

U+33F3 ( ㏳ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY

U+33F4 ( ㏴ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-ONE

U+33F5 ( ㏵ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-TWO

U+33F6 ( ㏶ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-THREE

U+33F7 ( ㏷ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-FOUR

U+33F8 ( ㏸ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-FIVE

U+33F9 ( ㏹ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-SIX

U+33FA ( ㏺ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-SEVEN

U+33FB ( ㏻ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-EIGHT

U+33FC ( ㏼ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY TWENTY-NINE

U+33FD ( ㏽ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTY

U+33FE ( ㏾ ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR DAY THIRTY-ONE

@Han, Hangul, Hiragana, Katakana, Bopomofo, Yi, Phags-pa, Tibetan

CJK Symbols And Punctuation - CJK symbols and punctuation

U+3001 ( 、 ) IDEOGRAPHIC COMMA

U+3002 ( 。 ) IDEOGRAPHIC FULL STOP

Halfwidth And Fullwidth Forms - Halfwidth CJK punctuation

U+FF61 ( ｡ ) HALFWIDTH IDEOGRAPHIC FULL STOP

U+FF64 ( ､ ) HALFWIDTH IDEOGRAPHIC COMMA

@Hiragana, Katakana



Jpan = Japanese (alias for Han + Hiragana + Katakana)

CJK Symbols And Punctuation - CJK symbols

U+3012 ( 〒 ) POSTAL MARK

CJK Symbols And Punctuation - Other CJK symbols

U+3036 ( 〶 ) CIRCLED POSTAL MARK



@Hangul



Kore = Korean (alias for Hangul + Han)

Enclosed CJK Letters And Months - Symbol

U+327F ( ㉿ ) KOREAN STANDARD SYMBOL



@Han, Hangul, Hiragana, Katakana, Bopomofo

For easier comparison, these are also broken down by General Category.

General-Category=Punctuation

CJK Symbols And Punctuation - CJK symbols and punctuation

U+3001 ( 、 ) IDEOGRAPHIC COMMA

U+3002 ( 。 ) IDEOGRAPHIC FULL STOP

U+3003 ( 〃 ) DITTO MARK

U+301C ( 〜 ) WAVE DASH

U+301D ( 〝 ) REVERSED DOUBLE PRIME QUOTATION MARK

U+301E ( 〞 ) DOUBLE PRIME QUOTATION MARK

U+301F ( 〟 ) LOW DOUBLE PRIME QUOTATION MARK

CJK Symbols And Punctuation - Other CJK symbols

U+3030 ( 〰 ) WAVY DASH

CJK Symbols And Punctuation - Other CJK punctuation

U+303D ( 〽 ) PART ALTERNATION MARK

CJK Compatibility Forms - Sidelining emphasis marks

U+FE45 ( ﹅ ) SESAME DOT

U+FE46 ( ﹆ ) WHITE SESAME DOT



General-Category=Symbol

CJK Symbols And Punctuation - CJK symbols and punctuation

U+3004 ( 〄 ) JAPANESE INDUSTRIAL STANDARD SYMBOL

CJK Symbols And Punctuation - CJK symbols and punctuation

U+3020 ( 〠 ) POSTAL MARK FACE

CJK Symbols And Punctuation - CJK symbols

U+3013 ( 〓 ) GETA MARK

CJK Symbols And Punctuation - Other CJK symbols

U+3037 ( 〷 ) IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL

CJK Symbols And Punctuation - Special CJK indicators

U+303E ( 〾 ) IDEOGRAPHIC VARIATION INDICATOR

U+303F ( 〿 ) IDEOGRAPHIC HALF FILL SPACE



General-Category=Letter

CJK Symbols And Punctuation - CJK symbols and punctuation

U+3006 ( 〆 ) IDEOGRAPHIC CLOSING MARK

CJK Symbols And Punctuation - Other CJK punctuation

U+303C ( 〼 ) MASU MARK
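The General Category breakdown above can be reproduced mechanically. The following is a minimal sketch, assuming Python's standard unicodedata module; the results reflect whatever Unicode version that Python build ships with, not necessarily the version this document was written against. The helper name group_by_major_category is made up for illustration.

```python
import unicodedata
from collections import defaultdict

# Code points copied from the three General-Category lists above.
CODE_POINTS = [
    0x3001, 0x3002, 0x3003, 0x301C, 0x301D, 0x301E, 0x301F,
    0x3030, 0x303D, 0xFE45, 0xFE46,                  # listed as Punctuation
    0x3004, 0x3020, 0x3013, 0x3037, 0x303E, 0x303F,  # listed as Symbol
    0x3006, 0x303C,                                  # listed as Letter
]

def group_by_major_category(code_points):
    """Group code points by the first letter of their General Category."""
    groups = defaultdict(list)
    for cp in code_points:
        gc = unicodedata.category(chr(cp))  # e.g. 'Po', 'So', 'Lo'
        groups[gc[0]].append(cp)
    return dict(groups)

groups = group_by_major_category(CODE_POINTS)
for major, cps in sorted(groups.items()):
    print(major, " ".join(f"U+{cp:04X}" for cp in cps))
```

The standard library does not expose the Script property, so this only checks the General Category half of the comparison.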



@Han, Hangul, Hiragana, Katakana, Bopomofo, Yi, Phags-pa, Tibetan

Other scripts of China

CJK Symbols And Punctuation - CJK angle brackets

U+3008 ( 〈 ) LEFT ANGLE BRACKET

U+3009 ( 〉 ) RIGHT ANGLE BRACKET

U+300A ( 《 ) LEFT DOUBLE ANGLE BRACKET

U+300B ( 》 ) RIGHT DOUBLE ANGLE BRACKET

U+300C ( 「 ) LEFT CORNER BRACKET

U+300D ( 」 ) RIGHT CORNER BRACKET

U+300E ( 『 ) LEFT WHITE CORNER BRACKET

U+300F ( 』 ) RIGHT WHITE CORNER BRACKET

U+3010 ( 【 ) LEFT BLACK LENTICULAR BRACKET

U+3011 ( 】 ) RIGHT BLACK LENTICULAR BRACKET

U+3014 ( 〔 ) LEFT TORTOISE SHELL BRACKET

U+3015 ( 〕 ) RIGHT TORTOISE SHELL BRACKET

U+3016 ( 〖 ) LEFT WHITE LENTICULAR BRACKET

U+3017 ( 〗 ) RIGHT WHITE LENTICULAR BRACKET

U+3018 ( 〘 ) LEFT WHITE TORTOISE SHELL BRACKET

U+3019 ( 〙 ) RIGHT WHITE TORTOISE SHELL BRACKET

U+301A ( 〚 ) LEFT WHITE SQUARE BRACKET

U+301B ( 〛 ) RIGHT WHITE SQUARE BRACKET

Halfwidth And Fullwidth Forms - Halfwidth CJK punctuation

U+FF62 ( ｢ ) HALFWIDTH LEFT CORNER BRACKET

U+FF63 ( ｣ ) HALFWIDTH RIGHT CORNER BRACKET

@Han, Bopomofo

CJK Symbols And Punctuation - Diacritics

U+302A ( 〪 ) IDEOGRAPHIC LEVEL TONE MARK

U+302B ( 〫 ) IDEOGRAPHIC RISING TONE MARK

U+302C ( 〬 ) IDEOGRAPHIC DEPARTING TONE MARK

U+302D ( 〭 ) IDEOGRAPHIC ENTERING TONE MARK
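Since the @... lines are meant to be easy to extract in tools, here is a minimal sketch of such an extraction in Python. It assumes only the line shapes visible in this document: a line starting with @ followed by comma-separated values, then lines starting with U+XXXX. The function name parse_tagged_list and the sample text are illustrative, not part of any existing tool.

```python
import re

TAG = re.compile(r"^@(.+)$")                # e.g. "@Han, Bopomofo"
CHAR = re.compile(r"^U\+([0-9A-F]{4,6})\b")  # e.g. "U+302A ( ... ) NAME"

def parse_tagged_list(text):
    """Map each code point to the values on the most recent @ line."""
    current = None
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = TAG.match(line)
        if m:
            current = tuple(v.strip() for v in m.group(1).split(","))
            continue
        m = CHAR.match(line)
        if m and current is not None:
            # A code point listed under several @ lines keeps the last one seen.
            result[int(m.group(1), 16)] = current
    return result

SAMPLE = """\
@Hangul
Enclosed CJK Letters And Months - Symbol
U+327F ( x ) KOREAN STANDARD SYMBOL
@Han, Bopomofo
CJK Symbols And Punctuation - Diacritics
U+302A ( x ) IDEOGRAPHIC LEVEL TONE MARK
U+302B ( x ) IDEOGRAPHIC RISING TONE MARK
"""

tags = parse_tagged_list(SAMPLE)
for cp, values in tags.items():
    print(f"U+{cp:04X}", ", ".join(values))
```

Because some characters appear under more than one @ line in this document, a real tool would have to decide whether to keep the last value, the first, or the union; the sketch keeps the last.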

Letterlike symbols

@No-Change

The following have specific scripts:

Arabic - Letterlike symbol

U+0608 ( ‎؈‎ ) ARABIC RAY

Letterlike Symbols - Letterlike symbols

U+2126 ( Ω ) OHM SIGN

U+212A ( K ) KELVIN SIGN

U+212B ( Å ) ANGSTROM SIGN

U+2132 ( Ⅎ ) TURNED CAPITAL F

Letterlike Symbols - Lowercase Claudian letter

U+214E ( ⅎ ) TURNED SMALL F



The following, including a number that are apparently similar, do not (math characters removed). My guess is that these should be treated like Latin.



@Latin

Letterlike Symbols - Letterlike symbols

U+2100 ( ℀ ) ACCOUNT OF

U+2101 ( ℁ ) ADDRESSED TO THE SUBJECT

U+2103 ( ℃ ) DEGREE CELSIUS

U+2104 ( ℄ ) CENTRE LINE SYMBOL

U+2105 ( ℅ ) CARE OF

U+2106 ( ℆ ) CADA UNA

U+2107 ( ℇ ) EULER CONSTANT

U+2108 ( ℈ ) SCRUPLE

U+2109 ( ℉ ) DEGREE FAHRENHEIT

U+2114 ( ℔ ) L B BAR SYMBOL

U+2116 ( № ) NUMERO SIGN

U+2117 ( ℗ ) SOUND RECORDING COPYRIGHT

U+2118 ( ℘ ) SCRIPT CAPITAL P

U+211E ( ℞ ) PRESCRIPTION TAKE

U+211F ( ℟ ) RESPONSE

U+2120 ( ℠ ) SERVICE MARK

U+2121 ( ℡ ) TELEPHONE SIGN

U+2122 ( ™ ) TRADE MARK SIGN

U+2123 ( ℣ ) VERSICLE

U+2125 ( ℥ ) OUNCE SIGN

U+2127 ( ℧ ) INVERTED OHM SIGN

U+212E ( ℮ ) ESTIMATED SYMBOL

Letterlike Symbols - Additional letterlike symbols

U+2139 ( ℹ ) INFORMATION SOURCE

U+213A ( ℺ ) ROTATED CAPITAL Q

U+213B ( ℻ ) FACSIMILE SIGN

U+214A ( ⅊ ) PROPERTY LINE

U+214C ( ⅌ ) PER SIGN

U+214D ( ⅍ ) AKTIESELSKAB



Math Symbols with specific scripts

The following are the only Sm (Math_Symbol) characters that have explicit scripts (856 GC=Sm characters don't, and 1,027 Math=true characters don't) or that are Letterlike Symbols. The Arabic ones seem OK (if they are not used in Syriac, etc.). Should the Greek one be Common script?

@Common

Greek And Coptic - Variant letterforms and symbols

U+03F6 ( ϶ ) GREEK REVERSED LUNATE EPSILON SYMBOL

@No-Change

Arabic - Radix symbols

U+0606 ( ؆ ) ARABIC-INDIC CUBE ROOT

U+0607 ( ؇ ) ARABIC-INDIC FOURTH ROOT

Arabic - Letterlike symbol

U+0608 ( ‎؈‎ ) ARABIC RAY



Letterlike Symbols - Double-struck large operator

U+2140 ( ⅀ ) DOUBLE-STRUCK N-ARY SUMMATION

Letterlike Symbols - Additional letterlike symbols

U+2141 ( ⅁ ) TURNED SANS-SERIF CAPITAL G

U+2142 ( ⅂ ) TURNED SANS-SERIF CAPITAL L

U+2143 ( ⅃ ) REVERSED SANS-SERIF CAPITAL L

U+2144 ( ⅄ ) TURNED SANS-SERIF CAPITAL Y

U+214B ( ⅋ ) TURNED AMPERSAND





Characters whose canonical equivalents don't match in script

[Ed note: see also actions 110-A092 and 113-A036, and L2/07-071]



The following characters are Script=Greek, but their canonical equivalents are Script=Common. These are the only such characters that change from an explicit script to Common.

Greek Extended - Precomposed polytonic Greek

U+1FC1 ( ῁ ) GREEK DIALYTIKA AND PERISPOMENI

U+1FED ( ῭ ) GREEK DIALYTIKA AND VARIA

U+1FEE ( ΅ ) GREEK DIALYTIKA AND OXIA

U+1FEF ( ` ) GREEK VARIA

U+1FFD ( ´ ) GREEK OXIA
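The canonical equivalents of the five characters above can be listed with a short script. This is a sketch using Python's standard unicodedata module; it shows only the NFD decompositions. The Script values themselves are not available in the standard library, so the Greek versus Common comparison still has to be made against Scripts.txt.

```python
import unicodedata

# The five Script=Greek characters listed above.
GREEK_SPACING_ACCENTS = [0x1FC1, 0x1FED, 0x1FEE, 0x1FEF, 0x1FFD]

def canonical_decomposition(cp):
    """Return the full canonical decomposition (NFD) as a list of code points."""
    return [ord(c) for c in unicodedata.normalize("NFD", chr(cp))]

decomps = {cp: canonical_decomposition(cp) for cp in GREEK_SPACING_ACCENTS}
for cp, parts in decomps.items():
    print(f"U+{cp:04X} ->", " ".join(f"U+{p:04X}" for p in parts))
```

In each case the first code point of the decomposition is a Latin-1 or ASCII spacing accent (U+00A8, U+0060, or U+00B4), which is where the change to Script=Common comes from.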



This is not an issue for the other Modifier Symbols (Sk) in Greek blocks:

Greek And Coptic - Numeral signs

U+0375 ( ͵ ) GREEK LOWER NUMERAL SIGN // the only one without a compat decomp.

Greek And Coptic - Spacing accent marks

U+0384 ( ΄ ) GREEK TONOS

U+0385 ( ΅ ) GREEK DIALYTIKA TONOS

Greek Extended - Precomposed polytonic Greek

U+1FBD ( ᾽ ) GREEK KORONIS

U+1FBF ( ᾿ ) GREEK PSILI

U+1FC0 ( ῀ ) GREEK PERISPOMENI

U+1FCD ( ῍ ) GREEK PSILI AND VARIA

U+1FCE ( ῎ ) GREEK PSILI AND OXIA

U+1FCF ( ῏ ) GREEK PSILI AND PERISPOMENI

U+1FDD ( ῝ ) GREEK DASIA AND VARIA

U+1FDE ( ῞ ) GREEK DASIA AND OXIA

U+1FDF ( ῟ ) GREEK DASIA AND PERISPOMENI

U+1FFE ( ῾ ) GREEK DASIA



Nor is it an issue for the other Modifier Letters (Lm) in Greek blocks:

Greek And Coptic - Numeral signs

U+0374 ( ʹ ) GREEK NUMERAL SIGN

Greek And Coptic - Iota subscript

U+037A ( ͺ ) GREEK YPOGEGRAMMENI



While not having to do with Script, I ran across the following also:

Mc (Spacing_Mark) without script

These would probably be better as Sk (Modifier_Symbol). We should at least document them as the only cases of Mc that are not letter-like.

@Sk

Musical Symbols - Stems

U+1D165 ( 𝅥 ) MUSICAL SYMBOL COMBINING STEM

U+1D166 ( 𝅦 ) MUSICAL SYMBOL COMBINING SPRECHGESANG STEM

Musical Symbols - Augmentation dot

U+1D16D ( 𝅭 ) MUSICAL SYMBOL COMBINING AUGMENTATION DOT

Musical Symbols - Flags

U+1D16E ( 𝅮 ) MUSICAL SYMBOL COMBINING FLAG-1

U+1D16F ( 𝅯 ) MUSICAL SYMBOL COMBINING FLAG-2

U+1D170 ( 𝅰 ) MUSICAL SYMBOL COMBINING FLAG-3

U+1D171 ( 𝅱 ) MUSICAL SYMBOL COMBINING FLAG-4

U+1D172 ( 𝅲 ) MUSICAL SYMBOL COMBINING FLAG-5
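The current General Category of these eight characters can be confirmed with a quick check. This is a sketch using Python's standard unicodedata module, so it reports the values in the Unicode version bundled with that Python build.

```python
import unicodedata

# The musical combining characters listed above.
MUSICAL_COMBINING = [
    0x1D165, 0x1D166,                             # stems
    0x1D16D,                                      # augmentation dot
    0x1D16E, 0x1D16F, 0x1D170, 0x1D171, 0x1D172,  # flags
]

categories = {cp: unicodedata.category(chr(cp)) for cp in MUSICAL_COMBINING}
combining = {cp: unicodedata.combining(chr(cp)) for cp in MUSICAL_COMBINING}

for cp in MUSICAL_COMBINING:
    print(f"U+{cp:04X} gc={categories[cp]} ccc={combining[cp]} "
          f"{unicodedata.name(chr(cp))}")
```

Note that these also carry a non-zero canonical combining class, which any change to Sk would need to take into account.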

Math Symbols not marked with Sm

Although they are letters in form, these should all behave like Sm: they should not case fold or be treated as parts of words.



@Sm

[:block=Mathematical Alphanumeric Symbols:]

// plus [[:subhead=/(?i)letterlike/:][:block=/(?i)letterlike/:]&[:math:]-[:sm:]]

Letterlike Symbols - Letterlike symbols

U+2102 ( ℂ ) DOUBLE-STRUCK CAPITAL C

U+210A ( ℊ ) SCRIPT SMALL G

U+210B ( ℋ ) SCRIPT CAPITAL H

U+210C ( ℌ ) BLACK-LETTER CAPITAL H

U+210D ( ℍ ) DOUBLE-STRUCK CAPITAL H

U+210E ( ℎ ) PLANCK CONSTANT

U+210F ( ℏ ) PLANCK CONSTANT OVER TWO PI

U+2110 ( ℐ ) SCRIPT CAPITAL I

U+2111 ( ℑ ) BLACK-LETTER CAPITAL I

U+2112 ( ℒ ) SCRIPT CAPITAL L

U+2113 ( ℓ ) SCRIPT SMALL L

U+2115 ( ℕ ) DOUBLE-STRUCK CAPITAL N

U+2119 ( ℙ ) DOUBLE-STRUCK CAPITAL P

U+211A ( ℚ ) DOUBLE-STRUCK CAPITAL Q

U+211B ( ℛ ) SCRIPT CAPITAL R

U+211C ( ℜ ) BLACK-LETTER CAPITAL R

U+211D ( ℝ ) DOUBLE-STRUCK CAPITAL R

U+2124 ( ℤ ) DOUBLE-STRUCK CAPITAL Z

U+2128 ( ℨ ) BLACK-LETTER CAPITAL Z

U+2129 ( ℩ ) TURNED GREEK SMALL LETTER IOTA

U+212C ( ℬ ) SCRIPT CAPITAL B

U+212D ( ℭ ) BLACK-LETTER CAPITAL C

U+212F ( ℯ ) SCRIPT SMALL E

U+2130 ( ℰ ) SCRIPT CAPITAL E

U+2131 ( ℱ ) SCRIPT CAPITAL F

U+2133 ( ℳ ) SCRIPT CAPITAL M

U+2134 ( ℴ ) SCRIPT SMALL O

Letterlike Symbols - Hebrew letterlike math symbols

U+2135 ( ℵ ) ALEF SYMBOL

U+2136 ( ℶ ) BET SYMBOL

U+2137 ( ℷ ) GIMEL SYMBOL

U+2138 ( ℸ ) DALET SYMBOL

Letterlike Symbols - Additional letterlike symbols

U+213C ( ℼ ) DOUBLE-STRUCK SMALL PI

U+213D ( ℽ ) DOUBLE-STRUCK SMALL GAMMA

U+213E ( ℾ ) DOUBLE-STRUCK CAPITAL GAMMA

U+213F ( ℿ ) DOUBLE-STRUCK CAPITAL PI

Letterlike Symbols - Double-struck italic math symbols

U+2145 ( ⅅ ) DOUBLE-STRUCK ITALIC CAPITAL D

U+2146 ( ⅆ ) DOUBLE-STRUCK ITALIC SMALL D

U+2147 ( ⅇ ) DOUBLE-STRUCK ITALIC SMALL E

U+2148 ( ⅈ ) DOUBLE-STRUCK ITALIC SMALL I

U+2149 ( ⅉ ) DOUBLE-STRUCK ITALIC SMALL J
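To illustrate why the Letter categories matter in practice, the sketch below checks a few of the characters above with Python's standard unicodedata module and str.isalpha(). The sample selection is arbitrary; the point is only that general-purpose text processing sees these as ordinary letters.

```python
import unicodedata

# A few samples from the lists above, plus one from the
# Mathematical Alphanumeric Symbols block.
SAMPLES = [
    0x2102,   # DOUBLE-STRUCK CAPITAL C
    0x210E,   # PLANCK CONSTANT
    0x2135,   # ALEF SYMBOL
    0x213C,   # DOUBLE-STRUCK SMALL PI
    0x2145,   # DOUBLE-STRUCK ITALIC CAPITAL D
    0x1D400,  # MATHEMATICAL BOLD CAPITAL A
]

report = {
    cp: (unicodedata.category(chr(cp)), chr(cp).isalpha())
    for cp in SAMPLES
}

for cp, (gc, is_alpha) in report.items():
    print(f"U+{cp:04X} gc={gc} isalpha={is_alpha}")
```

Every sample has a Letter category and answers True to isalpha(), so word-oriented processing built on those tests will treat it as part of a word.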





Circled & Parenthesized

Note: all circled alphanumerics are So

ⓐⒶ ⓑⒷ ⓒⒸ ⓓⒹ ⓔⒺ ⓕⒻ ⓖ Ⓖ ⓗⒽ ⓘⒾ ⓙⒿ ⓚⓀ ⓛⓁ ⓜⓂ ⓝ Ⓝ ⓞⓄ ⓟⓅ ⓠⓆ ⓡⓇ ⓢⓈ ⓣⓉ ⓤ Ⓤ ⓥⓋ ⓦⓌ ⓧⓍ ⓨⓎ ⓩⓏ ㉠ ㉮ ㉡ ㉯ ㉢ ㉰ ㉣ ㉱ ㉤ ㉲ ㉥ ㉳ ㉦ ㉴ ㉧ ㉵ ㉾ ㉨ ㉶ ㉽ ㉩ ㉷ ㉼ ㉪ ㉸ ㉫ ㉹ ㉬ ㉺ ㉭㉻ ㋐-㋾ ㊤ ㊦ ㊥ ㊭ ㊡ ㊝ ㊢ ㊘ ㊩ ㊯ ㊞ ㊨ ㊔ ㊏ ㊰ ㊛ ㊫ ㊪ ㊧ ㊐ ㊊ ㊒ ㊍ ㊑ ㊣ ㊌ ㊟ ㊋ ㊕ ㊚ ㊬ ㊓ ㊗㊙ ㊖ ㊮ ㊜ ㊎ ㊠

except the following:

⓪ ① ⑩-⑲ ② ⑳ ㉑-㉙ ③ ㉚-㉟ ㊱-㊴ ④ ㊵-㊾ ⑤ ㊿ ⑥-⑨

 ㊀ ㊆ ㊂ ㊈ ㊁ ㊄ ㊇ ㊅ ㊉ ㊃



The Western numbers are understandable: the uncircled values are gc=Number. The uncircled Han characters, however, are not.



Similarly, all parenthesized alphanumerics are So

⒜-⒵ ㈀ ㈎ ㈁ ㈏ ㈂ ㈐ ㈃ ㈑ ㈄ ㈒ ㈅ ㈓ ㈆ ㈔ ㈇ ㈕ ㈝ ㈞ ㈈ ㈖ ㈜ ㈉ ㈗ ㈊ ㈘ ㈋ ㈙ ㈌ ㈚ ㈍ ㈛ ㈹ ㈽ ㉁ ㈸ ㈿ ㈴ ㈺ ㈯ ㈻ ㈰ ㈪ ㈲ ㈭ ㈱ ㈬ ㈫ ㈵ ㈼ ㈳ ㈷ ㉀ ㉂ ㉃ ㈶ ㈾ ㈮

except the following:

⑴ ⑽-⒆ ⑵ ⒇ ⑶-⑼

㈠ ㈦ ㈢ ㈨ ㈡ ㈤ ㈧ ㈥ ㈩ ㈣
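The split between So and the exceptions can be seen by sampling a few characters from each group. This is a sketch using Python's standard unicodedata module; the samples are a small, arbitrary selection from the lists above, and the values are those of the Unicode version bundled with the Python build.

```python
import unicodedata

def gc(cp):
    """General Category of a single code point."""
    return unicodedata.category(chr(cp))

# Enclosed letters, syllables, and non-numeric ideographs.
ENCLOSED_OTHER = [0x24B6, 0x24D0, 0x3260, 0x32D0, 0x328A,  # circled
                  0x249C, 0x3200, 0x3231]                  # parenthesized
# Enclosed Western digits and Han numerals (the exceptions).
ENCLOSED_NUMERIC = [0x24EA, 0x2460, 0x3251, 0x3280,        # circled
                    0x2474, 0x3220]                        # parenthesized

other = {cp: gc(cp) for cp in ENCLOSED_OTHER}
numeric = {cp: gc(cp) for cp in ENCLOSED_NUMERIC}

# The plain counterparts: DIGIT ONE and the ideograph for "one".
plain_digit = gc(0x0031)
plain_han_one = gc(0x4E00)

print("enclosed other:  ", sorted(set(other.values())))
print("enclosed numeric:", sorted(set(numeric.values())))
print("plain '1':", plain_digit, " plain U+4E00:", plain_han_one)
```

The enclosed Han numerals are No even though the plain ideograph U+4E00 is Lo, which is the asymmetry noted above.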



So should these be treated as So?

@No-Change

Enclosed CJK Letters And Months - Circled ideographs

U+3280 ( ㊀ ) CIRCLED IDEOGRAPH ONE

U+3281 ( ㊁ ) CIRCLED IDEOGRAPH TWO

U+3282 ( ㊂ ) CIRCLED IDEOGRAPH THREE

U+3283 ( ㊃ ) CIRCLED IDEOGRAPH FOUR

U+3284 ( ㊄ ) CIRCLED IDEOGRAPH FIVE

U+3285 ( ㊅ ) CIRCLED IDEOGRAPH SIX

U+3286 ( ㊆ ) CIRCLED IDEOGRAPH SEVEN

U+3287 ( ㊇ ) CIRCLED IDEOGRAPH EIGHT

U+3288 ( ㊈ ) CIRCLED IDEOGRAPH NINE

U+3289 ( ㊉ ) CIRCLED IDEOGRAPH TEN

Enclosed CJK Letters And Months - Parenthesized ideographs

U+3220 ( ㈠ ) PARENTHESIZED IDEOGRAPH ONE

U+3221 ( ㈡ ) PARENTHESIZED IDEOGRAPH TWO

U+3222 ( ㈢ ) PARENTHESIZED IDEOGRAPH THREE

U+3223 ( ㈣ ) PARENTHESIZED IDEOGRAPH FOUR

U+3224 ( ㈤ ) PARENTHESIZED IDEOGRAPH FIVE

U+3225 ( ㈥ ) PARENTHESIZED IDEOGRAPH SIX

U+3226 ( ㈦ ) PARENTHESIZED IDEOGRAPH SEVEN

U+3227 ( ㈧ ) PARENTHESIZED IDEOGRAPH EIGHT

U+3228 ( ㈨ ) PARENTHESIZED IDEOGRAPH NINE

U+3229 ( ㈩ ) PARENTHESIZED IDEOGRAPH TEN



