L2/06-344

Source: "Mark Davis"
Date: 2006-10-10
Subject: Deprecated characters

The Deprecated property is relatively new, and we have not done a thorough review of characters that should have that property. As a reminder, Deprecated means discouraged, but does not mean that the character will be removed. Having the property means that the characters that are discouraged can be "recognized" programmatically.

The current list (U5.0) of deprecated characters is:

0340..0341    ; Deprecated # Mn   [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK
17A3          ; Deprecated # Lo       KHMER INDEPENDENT VOWEL QAQ
17D3          ; Deprecated # Mn       KHMER SIGN BATHAMASAT
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES

I believe we should also deprecate the characters:

E0020    TAG SPACE
E0021    TAG EXCLAMATION MARK
...
E007E    TAG TILDE
E007F    CANCEL TAG

These were only encoded to forestall the formulation of a truly ugly variant of UTF-8, but we really need to strongly discourage their use.

===

The following are other characters listed in NamesList.txt as being discouraged or having a preferred alternate. Some of which could also be considered for deprecation, after review.

0344    COMBINING GREEK DIALYTIKA TONOS
    * use of this character is discouraged
    : 0308 0301
0F73    TIBETAN VOWEL SIGN II
    * use of this character is discouraged
    : 0F71 0F72
0F75    TIBETAN VOWEL SIGN UU
    * use of this character is discouraged
    : 0F71 0F74
0F77    TIBETAN VOWEL SIGN VOCALIC RR
    * use of this character is strongly discouraged
    # 0FB2 0F81
0F79    TIBETAN VOWEL SIGN VOCALIC LL
    * use of this character is strongly discouraged
    # 0FB3 0F81
0F81    TIBETAN VOWEL SIGN REVERSED II
    * use of this character is discouraged
    : 0F71 0F80
17D3    KHMER SIGN BATHAMASAT *
    * originally intended as part of lunar date symbols
    * use of this character is strongly discouraged in favor of the complete set of lunar date symbols
    x (khmer symbol pathamasat - 19E0)
17D8    KHMER SIGN BEYYAL *
    * et cetera
    * use of this character is discouraged; other abbreviations for et cetera also exist
    * preferred spelling: 17D4 179B 17D4
@+        These are discouraged for mathematical use because of their canonical equivalence to CJK punctuation.
2329    LEFT-POINTING ANGLE BRACKET
    x (less-than sign - 003C)
    x (single left-pointing angle quotation mark - 2039)
    x (mathematical left angle bracket - 27E8)
    : 3008 left angle bracket
232A    RIGHT-POINTING ANGLE BRACKET
    x (greater-than sign - 003E)
    x (single right-pointing angle quotation mark - 203A)
    x (mathematical right angle bracket - 27E9)
    : 3009 right angle bracket
2126    OHM SIGN
    * SI unit of resistance, named after G. S. Ohm, German physicist
    * preferred representation is 03A9
    : 03A9 greek capital letter omega
20A4    LIRA SIGN
    * intended for lira, but not widely used
    * preferred character for lira is 00A3
    x (pound sign - 00A3)
212B    ANGSTROM SIGN
    * non SI length unit (=0.1 nm) named after A. J. Ångström, Swedish physicist
    * preferred representation is 00C5
    : 00C5 latin capital letter a with ring above


> From: "Andrew West"
> Date: 2006-10-10 11:22:03 -0700

On 10/10/06, Mark Davis wrote:
>
> The following are other characters listed in NamesList.txt as being discouraged or
> having a preferred alternate. Some of which could also be considered for deprecation,
> after review.

Also:

17A3 KHMER INDEPENDENT VOWEL QAQ *
    * originally intended only for Pali/Sanskrit transliteration
    * use of this character is strongly discouraged; 17A2 should be used instead
17A4 KHMER INDEPENDENT VOWEL QAA *
    * used only for Pali/Sanskrit transliteration
    * use of this character is discouraged; the sequence 17A2 17B6 should be used instead

Andrew


From: "Mansour, Kamal"
Date: 2006-10-11 10:22:15 -0700
To: Mark Davis
Subject: Re: UTC Agenda Items: Deprecated characters
CC: UTC
Sender: unicore-bounce@unicode.org

In line with the Ohm and Angstrom signs, I’d like to add another character to the list to be considered for deprecation:

212A    KELVIN SIGN
    : 004B K latin capital letter K

Kamal
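

Illustration: the first message notes that having the Deprecated property lets discouraged characters be "recognized" programmatically. The following is a minimal Python sketch of that idea, not a definitive implementation. It assumes the four U5.0 lines quoted in the first message are the entire data set, and the names DEPRECATED_U50, parse_ranges, is_deprecated, and find_deprecated are hypothetical helpers invented for this example, not part of any Unicode library. The proposed additions (the tag characters, KELVIN SIGN, and so on) are deliberately left out because they were only candidates at the time of this thread.

```python
# Sketch only: the data below is the U5.0 Deprecated list quoted in the
# first message; every name defined here is illustrative, not a real API.
DEPRECATED_U50 = """\
0340..0341    ; Deprecated # Mn   [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK
17A3          ; Deprecated # Lo       KHMER INDEPENDENT VOWEL QAQ
17D3          ; Deprecated # Mn       KHMER SIGN BATHAMASAT
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
"""


def parse_ranges(text, prop="Deprecated"):
    """Return inclusive (first, last) code point pairs for lines carrying `prop`."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != prop:
            continue
        # A field is either a single code point or a "first..last" range.
        first, _, last = fields[0].partition("..")
        ranges.append((int(first, 16), int(last or first, 16)))
    return ranges


DEPRECATED_RANGES = parse_ranges(DEPRECATED_U50)


def is_deprecated(ch):
    """True if the single character `ch` falls in one of the parsed ranges."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in DEPRECATED_RANGES)


def find_deprecated(text):
    """List (index, 'U+XXXX') for each deprecated character found in `text`."""
    return [(i, f"U+{ord(c):04X}") for i, c in enumerate(text) if is_deprecated(c)]
```

For example, find_deprecated("a\u17A3b") returns [(1, 'U+17A3')], while OHM SIGN (2126) is not flagged, since in this data it is only annotated as having a preferred representation and is not in the Deprecated list.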