L2/06-344

Source: "Mark Davis" <mark.davis@icu-project.org>
Date: 2006-10-10
Subject: Deprecated characters

The Deprecated property is relatively new, and we have not done a thorough
review of characters that should have that property. As a reminder,
Deprecated means that use of a character is discouraged; it does not mean
that the character will be removed. Having the property lets software
recognize the discouraged characters programmatically.

The current list (U5.0) of deprecated characters is:

0340..0341    ; Deprecated # Mn   [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK
17A3          ; Deprecated # Lo       KHMER INDEPENDENT VOWEL QAQ
17D3          ; Deprecated # Mn       KHMER SIGN BATHAMASAT
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
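
As an illustration of what recognizing these characters programmatically could look like, here is a minimal sketch in Python. It parses the four lines quoted above, which follow the PropList.txt syntax, and reports deprecated code points found in a string. The names parse_ranges, find_deprecated, DEPRECATED_U50 and RANGES are hypothetical and chosen only for this example; a real implementation would more likely read the property from the UCD data files or from a library.

```python
import re

# The U5.0 list quoted above, copied verbatim (PropList.txt-style syntax).
DEPRECATED_U50 = """\
0340..0341    ; Deprecated # Mn   [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK
17A3          ; Deprecated # Lo       KHMER INDEPENDENT VOWEL QAQ
17D3          ; Deprecated # Mn       KHMER SIGN BATHAMASAT
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
_LINE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_ranges(text, prop="Deprecated"):
    """Return a list of inclusive (first, last) code point ranges carrying `prop`."""
    ranges = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            last = int(m.group(2), 16) if m.group(2) else first
            ranges.append((first, last))
    return ranges


def find_deprecated(s, ranges):
    """Return (index, code point) pairs for each character of `s` inside `ranges`."""
    return [
        (i, ord(ch))
        for i, ch in enumerate(s)
        if any(lo <= ord(ch) <= hi for lo, hi in ranges)
    ]


RANGES = parse_ranges(DEPRECATED_U50)
```

For example, find_deprecated("a\u0340b", RANGES) reports the COMBINING GRAVE TONE MARK at index 1.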

I believe we should also deprecate the characters:

E0020    TAG SPACE
E0021    TAG EXCLAMATION MARK
...
E007E    TAG TILDE
E007F    CANCEL TAG

These were encoded only to forestall the formulation of a truly ugly variant of
UTF-8, but we really need to strongly discourage their use.

===

The following are other characters listed in NamesList.txt as being discouraged or
having a preferred alternate. Some of these could also be considered for deprecation,
after review.

0344    COMBINING GREEK DIALYTIKA TONOS
    * use of this character is discouraged
    : 0308 0301
0F73    TIBETAN VOWEL SIGN II
    * use of this character is discouraged
    : 0F71 0F72
0F75    TIBETAN VOWEL SIGN UU
    * use of this character is discouraged
    : 0F71 0F74
0F77    TIBETAN VOWEL SIGN VOCALIC RR
    * use of this character is strongly discouraged
    # 0FB2 0F81
0F79    TIBETAN VOWEL SIGN VOCALIC LL
    * use of this character is strongly discouraged
    # 0FB3 0F81
0F81    TIBETAN VOWEL SIGN REVERSED II
    * use of this character is discouraged
    : 0F71 0F80
17D3    KHMER SIGN BATHAMASAT *
    * originally intended as part of lunar date symbols
    * use of this character is strongly discouraged in favor of the complete set of lunar date symbols
    x (khmer symbol pathamasat - 19E0)
17D8    KHMER SIGN BEYYAL *
    * et cetera
    * use of this character is discouraged; other abbreviations for et cetera also exist
    * preferred spelling: 17D4 179B 17D4

@+        These are discouraged for mathematical use because of their canonical equivalence to CJK punctuation.
2329    LEFT-POINTING ANGLE BRACKET
    x (less-than sign - 003C)
    x (single left-pointing angle quotation mark - 2039)
    x (mathematical left angle bracket - 27E8)
    : 3008 left angle bracket
232A    RIGHT-POINTING ANGLE BRACKET
    x (greater-than sign - 003E)
    x (single right-pointing angle quotation mark - 203A)
    x (mathematical right angle bracket - 27E9)
    : 3009 right angle bracket

2126    OHM SIGN
    * SI unit of resistance, named after G. S. Ohm, German physicist
    * preferred representation is 03A9
    : 03A9 greek capital letter omega
20A4    LIRA SIGN
    * intended for lira, but not widely used
    * preferred character for lira is 00A3
    x (pound sign - 00A3)
212B    ANGSTROM SIGN
    * non SI length unit (=0.1 nm) named after A. J. Ångström, Swedish physicist
    * preferred representation is 00C5
    : 00C5 latin capital letter a with ring above
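
Several of the entries above carry a ":" line, which in this listing gives a canonical equivalent (the "#" lines for 0F77 and 0F79 are compatibility mappings only). A minimal sketch of how that relationship can be checked with Python's standard unicodedata module follows. The pair list is transcribed from the entries above; the names CANONICAL_PAIRS, canonically_equivalent and replaced_by_nfc are hypothetical and used only for illustration.

```python
import unicodedata

# (discouraged character, preferred representation), from the ":" lines above.
CANONICAL_PAIRS = [
    ("\u0344", "\u0308\u0301"),  # COMBINING GREEK DIALYTIKA TONOS
    ("\u0F73", "\u0F71\u0F72"),  # TIBETAN VOWEL SIGN II
    ("\u0F75", "\u0F71\u0F74"),  # TIBETAN VOWEL SIGN UU
    ("\u0F81", "\u0F71\u0F80"),  # TIBETAN VOWEL SIGN REVERSED II
    ("\u2329", "\u3008"),        # LEFT-POINTING ANGLE BRACKET
    ("\u232A", "\u3009"),        # RIGHT-POINTING ANGLE BRACKET
    ("\u2126", "\u03A9"),        # OHM SIGN
    ("\u212B", "\u00C5"),        # ANGSTROM SIGN
]


def canonically_equivalent(a, b):
    """True if `a` and `b` have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def replaced_by_nfc(ch):
    """True if normalizing to NFC turns `ch` into something else."""
    return unicodedata.normalize("NFC", ch) != ch
```

Under this check every pair in CANONICAL_PAIRS compares as equivalent, while 20A4 LIRA SIGN and 00A3 do not, since that entry is only a cross reference. One consequence is that text normalized to NFC no longer contains characters such as 2126 or 212B at all.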




From: "Andrew West" <andrewcwest@gmail.com>
Date: 2006-10-10 11:22:03 -0700

On 10/10/06, Mark Davis <mark.davis@icu-project.org> wrote:
>
> The following are other characters listed in NamesList.txt as being discouraged or
> having a preferred alternate. Some of which could also be considered for deprecation,
> after review.

Also:

17A3	KHMER INDEPENDENT VOWEL QAQ *
	* originally intended only for Pali/Sanskrit transliteration
	* use of this character is strongly discouraged; 17A2 should be used instead
17A4	KHMER INDEPENDENT VOWEL QAA *
	* used only for Pali/Sanskrit transliteration
	* use of this character is discouraged; the sequence 17A2 17B6 should be used instead


Andrew


From: "Mansour, Kamal" <kamal.mansour@monotypeimaging.com>
Date: 2006-10-11 10:22:15 -0700
To: Mark Davis <mark.davis@icu-project.org>
Subject: Re: UTC Agenda Items: Deprecated characters
CC: UTC <unicore@unicode.org>
Sender: unicore-bounce@unicode.org

In line with the Ohm and Angstrom signs, I’d like to add another character to the list to be considered for deprecation:

212A    KELVIN SIGN
: 004B K latin capital letter K

Kamal