Unicode code points of Tamil Grantham conjunct SRI

From: N. Ganesan (naa.ganesan@gmail.com)
Date: Sun May 01 2005 - 08:54:27 CDT


    Some on this list may recall last month's discussions of Visarga
    and Aaytham: the intricate relationship between them as recorded
    in scholarly publications, and even the word aaytham itself, which
    derives from a visarga term, aa'srita of Sanskrit, and is quite
    different from aayutham 'weapon'. Mentions of and mails with
    unattested words like VisargaL seem to have abated.

    -------

    Likewise, I felt it essential to set out the basic Unicode code
    points of the Sanskrit term SRI, as used in all of India, and its
    Tamil Grantham code points.

    The Unicode-accepted proposal on sha (U+0bb6) correctly
    identifies SRI as being <0BB6, 0BCD, 0BB0, 0BC0>. It mentions SRI
    ligature being made up of U+0bb6 prominently:
    http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2617.pdf
    Section 2.3 explicitly mentions the use of U+0bb8 in SRI ligature as
    *incorrect*.
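
    As a small illustration of the sequence quoted above, here is a
    minimal Python sketch that builds the four code points and looks
    up their Unicode character names. The names SRI_CODE_POINTS, SRI
    and describe are hypothetical helpers of mine, not taken from
    n2617. Whether the string displays as a single ligature depends on
    the font and rendering engine, which this sketch does not check.

```python
import unicodedata

# Code point sequence for SRI as quoted in the message: <0BB6, 0BCD, 0BB0, 0BC0>.
SRI_CODE_POINTS = (0x0BB6, 0x0BCD, 0x0BB0, 0x0BC0)
SRI = "".join(chr(cp) for cp in SRI_CODE_POINTS)


def describe(text):
    """Return a (U+XXXX, Unicode character name) pair for each code point."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Print one line per code point, e.g. the first line names U+0BB6.
for code, name in describe(SRI):
    print(code, name)
```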

    The review document, WG2 (Unicode) document
    number n2618,
    http://wwwold.dkuug.dk/JTC1/SC2/WG2/docs/n2618
    talks about SRI and its component sha (0bb6).
    The WG2 document clearly specifies why
    sha (0bb6) is needed for Tamil:
    "ISCII included letters for {Ss}, {s}, {h}
    but left out the letter for {sh} in Tamil. This
    resulted in a major deficiency in the code
    - for instance, there is no way of representing
    the backing string of a very important 'akshara' in
    the language viz., {SRI}".

    I often hear that SRI is sometimes written
    differently. Yes, 100% agreed. Tamil nativizes
    loan words, letters and conjuncts borrowed from
    Sanskrit Grantham in different ways. In fact, this is one
    yardstick used by linguistics specialists to show that
    a particular word is a borrowing in a language.
    Take the conjunct kSha (thank God, Unicode
    does not give it a separate code point, unlike
    hacked encodings). kSha is tamilized in various
    ways: -kk-, -cc-, -Tc- and so on, with the additional
    operative rule that word-initially kSha- will
    become k- or c-. Likewise, the Sri ligature is tamilized
    in many ways: e.g., tiru or cirii (long-standing usage;
    see the Azhvar paasurams) or something else.
    But these nativization attempts differ from
    person to person, time to time, district to district.
    In English script, the SRI conjunct is written in multiple
    ways: sri, srii, sri_with_a_macron, s(acute)ri(macron),
    sree, shree, shrii, shri, ... As we know well, Tamil script
    can likewise make different attempts at nativizing the
    loan word SRI from Sanskrit: cirii, cii (as in ciitaran and
    ciivalappEri, a town in Tinnevelly dist.; there was a movie
    ciivalappEri paaNDi; ciivalan < srivallabhan),
    siri, sirii, ... In all of these, r can be replaced with R by some,
    and s (0bb6) can also be replaced with 0bb8, 0bb7 and so on.
    So many combinations and permutations, a bewildering array, are
    possible. These tamilizing attempts can be seen in nonconjunct
    and conjunct ksha also: -Tc-, -kk-, -cc-, with the additional
    operative rule that word-initial consonants in Tamil
    words will be elided.
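
    To give a feel for how quickly the variants multiply, here is a
    rough Python sketch that enumerates only the conjunct-shaped
    spellings (sibilant + virama + r + vowel sign II). It assumes that
    "R" above means U+0BB1 (TAMIL LETTER RRA); that mapping is my
    assumption, and the names SIBILANTS, RHOTICS and conjunct_variants
    are hypothetical. Nativized spellings such as cirii or cii are not
    covered.

```python
from itertools import product

VIRAMA = "\u0BCD"          # TAMIL SIGN VIRAMA
VOWEL_SIGN_II = "\u0BC0"   # TAMIL VOWEL SIGN II

# Sibilants mentioned in the message: 0bb6, 0bb8, 0bb7.
SIBILANTS = {"sha": "\u0BB6", "sa": "\u0BB8", "ssa": "\u0BB7"}

# Assumption: "r" is U+0BB0 (RA) and "R" is U+0BB1 (RRA).
RHOTICS = {"ra": "\u0BB0", "rra": "\u0BB1"}


def conjunct_variants():
    """Map (sibilant, rhotic) labels to the four-code-point spelling."""
    return {
        (s_name, r_name): s + VIRAMA + r + VOWEL_SIGN_II
        for (s_name, s), (r_name, r) in product(SIBILANTS.items(), RHOTICS.items())
    }


variants = conjunct_variants()
print(len(variants), "conjunct-shaped spellings")
```

    Of these, only the ("sha", "ra") entry matches the sequence
    <0BB6, 0BCD, 0BB0, 0BC0>.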

    I wrote a letter to Sri Kalyan (Project Madurai)
    explaining the need to use the
    correct code points for the Tamil
    Grantha ligature SRI, namely
    <U+0BB6, U+0BCD, U+0BB0, U+0BC0>:
    http://www.services.cnrs.fr/wws/arc/ctamil/2005-04/msg00034.html
    These code points and their equivalents
    are used not just in Tamil but throughout
    India to produce the conjunct Shree
    (whatever the Indic script may be).
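
    For anyone maintaining existing texts, a minimal Python sketch of
    how SRI encoded with U+0BB8 might be found and rewritten to the
    U+0BB6 sequence is given here. The function names normalize_sri
    and count_sa_based_sri are hypothetical, and the blanket
    replacement assumes that every occurrence of <0BB8, 0BCD, 0BB0,
    0BC0> in the text is meant as the SRI ligature, which should be
    verified for a given corpus.

```python
# Sequence recommended in the message: SHA + VIRAMA + RA + VOWEL SIGN II.
SRI_CORRECT = "\u0BB6\u0BCD\u0BB0\u0BC0"

# Variant built on U+0BB8 (SA), which n2617 section 2.3 calls incorrect for SRI.
SRI_WITH_SA = "\u0BB8\u0BCD\u0BB0\u0BC0"


def count_sa_based_sri(text):
    """Count occurrences of the SA-based SRI sequence in text."""
    return text.count(SRI_WITH_SA)


def normalize_sri(text):
    """Rewrite every SA-based SRI sequence to the SHA-based sequence."""
    return text.replace(SRI_WITH_SA, SRI_CORRECT)


sample = SRI_WITH_SA + " " + SRI_CORRECT
print(count_sa_based_sri(sample))                  # SA-based occurrences before
print(count_sa_based_sri(normalize_sri(sample)))   # SA-based occurrences after
```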

    Hence the *definition* of the Sanskrit Grantham ligature:
    SRI = <0BB6, 0BCD, 0BB0, 0BC0>
    This is used all across India.
    My recommendation is therefore to follow this long-standing
    usage in future documents.

    Hope this helps,
    Naga Ganesan, Ph.D.


