L2/05-014

Date: Thu, 20 Jan 2005 10:48:36 -0800
From: Markus Scherer
Subject: Pattern_Syntax problematic characters

Pattern_Syntax contains some characters that either have identifier-like properties or numeric values or are compatibility variants of such characters.

I propose to remove the following characters from Pattern_Syntax:

1. The following 4 characters are also in ID_Continue (they are Pc=Connector_Punctuation):

   _  U+005F  LOW LINE           Connector_Punctuation  Basic_Latin          Zyyy  -  ON
   ‿  U+203F  UNDERTIE           Connector_Punctuation  General_Punctuation  Zyyy  -  ON
   ⁀  U+2040  CHARACTER TIE      Connector_Punctuation  General_Punctuation  Zyyy  -  ON
   ⁔  U+2054  INVERTED UNDERTIE  Connector_Punctuation  General_Punctuation  Zyyy  -  ON

2. The following 52 characters are also in Alphabetic: circled letters A-Z and a-z

   ⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ
   ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩ

3. Other compatibility variants (circled, parenthesized, etc.) of letters and digits; the digit variants have numeric values. (190 characters)

   2460..249B  # No [60] CIRCLED DIGIT ONE..NUMBER TWENTY FULL STOP
   249C..24E9  # So [78] PARENTHESIZED LATIN SMALL LETTER A..CIRCLED LATIN SMALL LETTER Z
   24EA..24FF  # No [22] CIRCLED DIGIT ZERO..NEGATIVE CIRCLED DIGIT ZERO
   2776..2793  # No [30] DINGBAT NEGATIVE CIRCLED DIGIT ONE..DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN
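
The counts and general categories quoted above can be cross-checked with a short script. The following is a minimal sketch, not part of the proposal: it assumes Python 3 and its standard `unicodedata` module, which exposes General_Category and numeric values but not Pattern_Syntax, ID_Continue, or Alphabetic, so those properties are not tested here. The names `RANGES`, `CONNECTORS`, `CIRCLED_LETTERS`, `range_size`, and `general_categories` are illustrative only.

```python
import unicodedata

# Inclusive code point ranges from item 3, with the counts stated there.
RANGES = [
    (0x2460, 0x249B, 60),
    (0x249C, 0x24E9, 78),
    (0x24EA, 0x24FF, 22),
    (0x2776, 0x2793, 30),
]

# The four connector punctuation characters from item 1.
CONNECTORS = [0x005F, 0x203F, 0x2040, 0x2054]

# The circled letters from item 2: A-Z followed by a-z.
CIRCLED_LETTERS = range(0x24B6, 0x24E9 + 1)


def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def general_categories(first, last):
    """Set of General_Category values found in the inclusive range."""
    return {unicodedata.category(chr(cp)) for cp in range(first, last + 1)}


if __name__ == "__main__":
    for first, last, stated in RANGES:
        print(f"{first:04X}..{last:04X}",
              "size:", range_size(first, last),
              "stated:", stated,
              "gc:", sorted(general_categories(first, last)))
    print("total:", sum(range_size(a, b) for a, b, _ in RANGES))
    print("circled letters:", len(CIRCLED_LETTERS))
    for cp in CONNECTORS:
        # Category reported by the Unicode version bundled with Python.
        print(f"U+{cp:04X}", unicodedata.name(chr(cp)),
              unicodedata.category(chr(cp)))
```

The range sizes are plain arithmetic on the code points given in the list. Note that the 52 circled letters of item 2 (U+24B6..U+24E9) lie inside the 249C..24E9 range of item 3. The category results depend on the Unicode version bundled with the Python build, which is newer than the version current when the proposal was written.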