> The UTC will occasionally deprecate characters. This means that the Unicode
> Standard will identify the characters and (strongly) recommend that they
> not be used for creating *new* text data. However, these characters may
> already exist in existing data and implementations may need to handle this
> case.

Suggestion: why not make the deprecation status of a character
automatically computable from the Unicode data files?

> If you are aware of characters that have been encoded in error and where
> formally deprecating the character is advantageous, you can submit a paper
> to UTC explaining the situation and UTC can act on that. However, it's
> important to note that many characters exist for narrow purposes, such as
> support of historic documents, legacy encodings, or minority usage. Such
> characters may appear extraneous but they would likely *not* be deprecated.

I'm not interested in deprecating characters, but in deprecating usages.
You just said that you will deprecate the use of U+0332 Combining Low Line
for underlining plain text. Am I understanding that correctly, or do you
want to deprecate the character itself?

When I asked about deprecation, I was thinking about the problem that
U+06C0 Arabic Letter Heh With Yeh Above will create for Persian
applications if it starts to be used widely. Currently it's rarely used,
mostly because it is unavailable in the fonts distributed with Microsoft
Windows, Office, etc.

The problem is that Unicode 3.0 defined that character to be canonically
equivalent to "U+06D5 Arabic Letter Ae + U+0654 Arabic Hamza Above", but
the Persian usage is "U+0647 Arabic Letter Heh + U+0654 Arabic Hamza
Above". Persian developers (including me) can't even guess what an Arabic
Letter Ae is or how it is used. Now consider the headache when they
encounter U+06C0 in Normalization Form D: it turns into a sequence that
starts with a letter they have never seen. (I can explain more if I'm
being unclear.)
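To make the mismatch concrete, here is a minimal sketch using Python's
standard unicodedata module. It assumes the module's bundled character
database (a later Unicode version than 3.0) still carries the same
decomposition for U+06C0, which the normalization stability policy should
guarantee. The constant names and the names() helper are mine, for
illustration only.

```python
import unicodedata

# Code points discussed above (constant names are illustrative only).
HEH_WITH_YEH_ABOVE = "\u06C0"
AE = "\u06D5"
HEH = "\u0647"
HAMZA_ABOVE = "\u0654"


def names(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


# NFD replaces U+06C0 with its canonical decomposition.
nfd = unicodedata.normalize("NFD", HEH_WITH_YEH_ABOVE)

# The sequence a Persian application would expect instead.
persian = HEH + HAMZA_ABOVE

print(names(nfd))
print("NFD gives Ae + Hamza Above: ", nfd == AE + HAMZA_ABOVE)
print("NFD gives Heh + Hamza Above:", nfd == persian)
# Composing the Persian sequence does not lead back to U+06C0 either.
print("NFC(Heh + Hamza Above) is U+06C0:",
      unicodedata.normalize("NFC", persian) == HEH_WITH_YEH_ABOVE)
```

So text that contains U+06C0 and text that contains "Heh + Hamza Above"
never compare equal under any normalization form, even though a Persian
reader would consider them the same.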

Can the UTC deprecate this character and require "Heh + Hamza Above" or
"Ae + Hamza Above" to be used instead, depending on the context?


