Re: Tamil sha (U+0BB6) - deprecate it?

From: Asmus Freytag
Date: Sun Jun 26 2005 - 12:56:40 CDT

    At 06:55 AM 6/26/2005, N. Ganesan wrote:
    >Srivas mentions in his mails about
    >deprecating SHA (U+0BB6).
    >This cannot be done, as some users
    >who want to transliterate Sanskrit words
    >one-to-one need the sha letter.
    >Wrote this in the INFITT WG on Unicode:
    >Deprecation of U+0BB6 cannot be done because:
    >a) Unicode stability policy prohibits moving or removing
    >an encoded letter. If some user does not want it, that's fine.

    If U+0BB6 is in active use, then there's no reason to deprecate it.

    However, I wanted to make a point about what "deprecation" means
    in the context of the Unicode Standard. It is correct that characters
    do not get removed. However, in certain circumstances, their use
    can be officially discouraged - usually because a better alternative
    is available.

    There are several levels of discouragement. At the most informal, the standard
    might simply point out a preferred alternative. At the most formal,
    the character is given the 'deprecated' status.

    A deprecated character is retained in the standard, and conformant
    implementations can still support it (and in fact are encouraged
    to treat it differently from a mere unassigned code point).
    However, spell-checkers and similar applications are encouraged
    to flag the use of a deprecated character, so that the user
    can replace it with a different character.
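
    As a rough illustration of how a checker might tell a deprecated
    character apart from an unassigned code point, here is a minimal
    Python sketch. The DEPRECATED set is a small hand-picked subset of
    code points that carry the Deprecated property in recent versions of
    the Unicode Character Database; the function names are hypothetical,
    and a real tool would load the full list from PropList.txt.

```python
import unicodedata

# Assumption: illustrative subset only, hard-coded for the sketch.
# U+0149, U+206A..U+206F and U+E0001 carry the Deprecated property in
# recent versions of the Unicode Character Database.
DEPRECATED = {0x0149, 0xE0001} | set(range(0x206A, 0x2070))


def classify(ch):
    """Return 'deprecated', 'unassigned', or 'ok' for one character."""
    if ord(ch) in DEPRECATED:
        return "deprecated"
    # General category Cn marks code points with no assigned character.
    if unicodedata.category(ch) == "Cn":
        return "unassigned"
    return "ok"


def flag_deprecated(text):
    """List (index, 'U+XXXX') for each deprecated character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if classify(ch) == "deprecated"
    ]
```

    A checker built this way would report such characters to the user
    rather than reject them, since a deprecated character remains a
    valid part of the standard.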

    In other words, a deprecated character is a character that's
    retained in the standard, in case some documents exist that
    use it, but there is no reason for current users or new
    documents to ever employ it.

    This is different from a special use character. For example,
    several characters in the Greek block are intended only for
    technical and scientific use. In ordinary Greek text, they
    should not be used, even though their form matches that of
    some ordinary characters (in some fonts).

    Such characters would be discouraged for use in ordinary
    text, but they would never be deprecated. Spell checkers
    and similar programs are encouraged to flag their use as
    an error, if it occurs in the middle of Greek words.
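
    A minimal Python sketch of that kind of check follows. The
    TECHNICAL_GREEK set is my own hand-picked assumption (the theta, phi,
    pi, kappa, rho and lunate epsilon symbols), not an official list, and
    the function names are hypothetical. A symbol is flagged only when it
    sits directly next to an ordinary Greek letter, so stand-alone use in
    a formula passes.

```python
import unicodedata

# Assumption: hand-picked symbol variants from the Greek block that are
# intended for technical use; not an official or complete list.
TECHNICAL_GREEK = {0x03D1, 0x03D5, 0x03D6, 0x03F0, 0x03F1, 0x03F5}


def is_greek_letter(ch):
    """True for an alphabetic Greek character outside the technical set."""
    return (
        ch.isalpha()
        and ord(ch) not in TECHNICAL_GREEK
        and unicodedata.name(ch, "").startswith("GREEK")
    )


def flag_symbols_in_words(text):
    """Return indices of technical symbols adjacent to a Greek letter."""
    hits = []
    for i, ch in enumerate(text):
        if ord(ch) not in TECHNICAL_GREEK:
            continue
        before = i > 0 and is_greek_letter(text[i - 1])
        after = i + 1 < len(text) and is_greek_letter(text[i + 1])
        if before or after:
            hits.append(i)
    return hits
```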

    It might be the case that U+0BB6 is such a special use
    character - used in some specific context, but discouraged
    for ordinary text.


    >b) U+0BB6 is very much needed for one-to-one transliteration
    >between Indic scripts including Tamil. (This is the main
    >reason I was enthusiastic about 0bb6 getting into Tamil
    >code chart)
    >c) Sri conjunct, a Sanskrit loan word in Tamil, is defined
    >using 0bb6 and its equivalent in all Indian languages.
    >(Yes, there could be approximating transcriptions,
    >but SRI's exact transliteration needs 0bb6).
    >Naga Ganesan
