Re: Deprecated characters in Unicode 5.1 vs Unicode 5.2

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 17 2010 - 14:03:55 CDT


    Hyungrok Kim asked:

    > The characters U+0340, U+0341, and U+17D3 are deprecated in Unicode 5.1.
    > In Unicode 5.2, however, they are no longer marked deprecated per
    > PropList.txt (even though they are still "discouraged" in the code
    > charts).
    > A. is this intentional?

    Yes.

    > B. if so, what was the rationale? (references to UTC meetings/JTC1
    > docs would be helpful)

    JTC1 documents would be irrelevant. The property Deprecated is a
    *Unicode* character property, maintained in the Unicode Character
    Database; SC2 does not have a concept of deprecating characters
    in a character encoding standard such as 10646.

    The relevant decision was based on Public Review Issue #122,
    and the decision taken by the UTC was recorded as 116-C13.

    The public copy of the minutes can be seen at:

    http://www.unicode.org/consortium/utc-minutes/UTC-116-200808.html

    Public Review Issue #122 and its resolution can be seen at:

    http://www.unicode.org/review/resolved-pri-100.html#pri122

    The UTC discussion essentially concluded that "Deprecated" had
    been used too imprecisely, and should be limited to only those
    characters which had serious *architectural* problems for their
    use -- not merely characters that amounted to duplicates or
    ones that various people felt should be discouraged from use for
    one reason or another.

    > C. shouldn't there be a Stability Policy w.r.t. deprecation? (i.e.,
    > a character mayn't be resurrected from deprecation)

    Nope. The UTC did not conclude that.

    >
    > I did google for the change, but couldn't unearth anything. Also the
    > Stability Policy doesn't cover deprecation to the best of my knowledge.

    Correct.

    Note that marking a character as Deprecated=True in the Unicode
    Character Database does not *remove* it from the standard. Deprecation
    does send a strong signal that a character should not be used,
    because it has architectural problems that may interfere with
    whatever its original intent was, but deprecation does not
    even mean that implementations cannot or will not use such
    characters in some limited contexts. For example, the Unicode Normalization
    Algorithm *requires* being able to provide the correct normalized
    value for a deprecated character, just as much as for any other
    character. An implementation of Unicode Normalization which
    did not do so, and which tried to ignore deprecated characters,
    would simply be non-conformant.
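
To illustrate the normalization point, here is a minimal sketch using Python's standard `unicodedata` module. It assumes the canonical singleton decompositions U+0340 -> U+0300 and U+0341 -> U+0301 for two of the characters discussed in this thread. The helper name `nfc_codepoints` is hypothetical, and Python's bundled Unicode data is newer than 5.1/5.2, although decomposition mappings are stable across versions.

```python
import unicodedata

def nfc_codepoints(text):
    """Return the NFC form of `text` as a list of U+XXXX labels."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]

# U+0340 and U+0341 were Deprecated=True in Unicode 5.1 (per this thread).
# A conformant normalizer must still map them to their canonical equivalents
# rather than dropping or ignoring them.
for ch in ("\u0340", "\u0341"):
    print(f"U+{ord(ch):04X} ->", " ".join(nfc_codepoints(ch)))
```

The same mapping applies in running text: `nfc_codepoints("a\u0340")` gives the precomposed U+00E0, because the singleton is first replaced by U+0300 and then composed with the base letter.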

    --Ken

    > Finally, the beta 6.0 deprecated characters are a superset of 5.2's
    > deprecated characters (but not 5.1's). So if there is a problem, both
    > 5.2 and 6.0 need corrections.
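
The superset comparison described in the question can be sketched as follows. This is a rough illustration, not an official tool: `parse_property` is a hypothetical helper, and the two sample strings are hand-written toy excerpts in PropList.txt syntax that contain only code points named in this thread plus the range U+206A..U+206F. They are not the actual data files.

```python
import re

# One PropList.txt data line: "XXXX ; Property" or "XXXX..YYYY ; Property".
LINE_RE = re.compile(r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)")

def parse_property(text, prop="Deprecated"):
    """Collect the code points assigned `prop` in PropList.txt-style text."""
    code_points = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        match = LINE_RE.match(line)
        if match is None or match.group(3) != prop:
            continue
        start = int(match.group(1), 16)
        end = int(match.group(2) or match.group(1), 16)
        code_points.update(range(start, end + 1))
    return code_points

# Toy excerpts (assumption: abbreviated by hand, not the real files).
SAMPLE_5_1 = """\
0009..000D    ; White_Space
0340..0341    ; Deprecated
17D3          ; Deprecated
206A..206F    ; Deprecated
"""
SAMPLE_5_2 = """\
0009..000D    ; White_Space
206A..206F    ; Deprecated
"""

dep_5_1 = parse_property(SAMPLE_5_1)
dep_5_2 = parse_property(SAMPLE_5_2)
no_longer_deprecated = sorted(dep_5_1 - dep_5_2)
print([f"U+{cp:04X}" for cp in no_longer_deprecated])
```

Run against the full PropList.txt of each version, the same set difference would list every character whose Deprecated value changed between releases.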


