Re: Virama based model - a note

From: N. Ganesan
Date: Mon May 16 2005 - 14:09:08 CDT

    P. Verdy wrote:
    >I fully agree with your analysis. And it confirms what I can also criticize
    >in the current proposal for encoding Javanese (which is also based on the
    >false assumption that the inherent vowel of Javanese letters is 'a', when in
    >fact some consonants instead use an inherent 'e' vowel, notably 're'
    >and 'le').

    Is this not covered in R. Ishida's tech note, UTN #10?
    Page 3 mentions that "the inherent vowel can vary
    in pronunciation from script to script", and the examples
    include U+0259 (ə), U+028C (ʌ) and U+0254 (ɔ).

    Perhaps the Javanese script description in the standard could
    record the additional inherent vowels that the virama removes.

    >This affects the Virama-based model, which considers that it is used mostly
    >to mark the absence of a vowel in the previous consonant, when in fact it
    >is a *leading* modifier to create the conjunct form of the *following*
    >combining consonant (which still keeps its inherent vowel, 'a' or 'e', and
    >to which the medial vowel signs apply).

    Your statement on what Virama does is correct for North Indian
    scripts, but not for Tamil.

    Tamil grammars, and epigraphy from the 2nd century onwards,
    state explicitly that the puLLi (the virama in Unicode)
    kills the inherent vowel /a/. Pure consonants
    in Tamil are individual units, and they do not form
    conjuncts as in other Indian scripts.
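
    In Unicode terms, a pure consonant is simply a consonant letter
    followed by U+0BCD. A minimal Python sketch using only the
    standard unicodedata module; the helper name pure_consonant is
    made up here for illustration:

```python
import unicodedata

PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA, i.e. the puLLi


def pure_consonant(letter):
    """Return the vowel-less form of a Tamil consonant letter (hypothetical helper)."""
    return letter + PULLI


ka = "\u0b95"            # TAMIL LETTER KA, read /ka/ with the inherent /a/
k = pure_consonant(ka)   # KA + puLLi, read /k/

print(unicodedata.name(PULLI))           # TAMIL SIGN VIRAMA
print([f"U+{ord(c):04X}" for c in k])    # ['U+0B95', 'U+0BCD']
```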

    So, the Tamil puLLi (viraamam) is *not* a leading modifier to create the
    conjunct form of the following combining consonant (which still keeps
    its inherent vowel, 'a' or 'e', and to which the medial vowel signs
    apply).

    While the scripts of other Indian languages do not normally write a
    visible virama in consonant clusters, Tamil uses the virama
    extensively to create pure consonants, whether in clusters or alone.
    (Nowadays, where Unicode fonts lack a rich set of
    ligatures, an explicit virama is used in Hindi and so on,
    following the model of the Tamil script.)

    This misunderstanding causes problems for Tamil hyphenation when
    words are split in web pages and word processors. Chapter 9 of the
    Unicode Standard states that:
    In the Tamil script, a consonant cluster is any sequence of one or
    more consonants separated by viramas, possibly terminated by a virama.

    While this is true, additional statements are needed. E.g.,
    syllable boundaries in written Tamil text do not
    treat two or more consonants as a unit. Here
    each consonant stands alone, and the letters
    adjacent to a pure consonant (indicated by the
    puLLi dot) can be pure consonants or consonant-vowel (abugida) letters.
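
    The Unicode definition quoted above can be sketched as a regular
    expression. This is only an illustration in Python: the names
    CLUSTER and clusters are made up, and the consonant class is
    assumed to be the block range KA..HA (U+0B95..U+0BB9), which
    also covers some unassigned code points.

```python
import re

CONSONANT = "[\u0b95-\u0bb9]"   # Tamil consonant letters KA..HA (assumed range)
PULLI = "\u0bcd"                # TAMIL SIGN VIRAMA

# consonant (virama consonant)* virama?
CLUSTER = re.compile(f"{CONSONANT}(?:{PULLI}{CONSONANT})*{PULLI}?")


def clusters(text):
    """Return every consonant cluster in text, per the quoted definition."""
    return CLUSTER.findall(text)


vayppu = "\u0bb5\u0bbe\u0baf\u0bcd\u0baa\u0bcd\u0baa\u0bc1"   # vAyppu
print(clusters(vayppu))   # two matches: "v" and "y-p-p"
```

    Note that the second match (y, p, p) runs straight across the
    point where a break is wanted in vAyp-pu, so the cluster as
    defined cannot be used as an unbreakable unit for Tamil.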

    Tamil words like "illai", "angkE", "vAyppu"
    should not be split i-llai, a-ngkE, vA-yppu
    for hyphenation. Since Tamil words do not start
    with pure consonant letters, the above example
    words must be split as il-lai, ang-kE, vAyp-pu
    and so on. In sum, consonant clusters must be treated
    differently in Tamil, not as in Sanskrit.
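
    The splitting rule above could be sketched as follows. This is a
    rough Python illustration, not a full hyphenation algorithm: the
    function names units and split_pieces are made up, and the only
    rule applied is that a piece may never begin with a pure
    consonant (a letter carrying the puLLi).

```python
import unicodedata

PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA


def units(word):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in word:
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch      # vowel sign or puLLi joins its base letter
        else:
            out.append(ch)
    return out


def split_pieces(word):
    """Split word so that no piece starts with a pure consonant."""
    pieces = []
    for unit in units(word):
        if pieces and unit.endswith(PULLI):
            pieces[-1] += unit  # pure consonant stays with what precedes it
        else:
            pieces.append(unit)
    return pieces


illai = "\u0b87\u0bb2\u0bcd\u0bb2\u0bc8"                      # illai
angke = "\u0b85\u0b99\u0bcd\u0b95\u0bc7"                      # angkE
vayppu = "\u0bb5\u0bbe\u0baf\u0bcd\u0baa\u0bcd\u0baa\u0bc1"   # vAyppu

for w in (illai, angke, vayppu):
    print("-".join(split_pieces(w)))   # il-lai, ang-kE, vAyp-pu in Tamil script
```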

    N. Ganesan

    Earlier I wrote:
    This leads us to mention an important point about Virama (=viraama)
    based models of Indian and even South East Asian languages. Nakanishi
    states the general principles of the Indian
    lettering system on p. 48:
    [Begin Quote]
    Devanagari script uses the basic system used for all
    the Indian scripts described in this chapter.
    (3) Each consonant includes an inherent a-vowel.
    (5) Conjunct consonants are used; when two or more
    consonants are combined with no intervening vowel,
    they are written as one letter.
    [End Quote]
    Quite simply, A. Nakanishi is wrong in stating rules (3)
    and (5) as far as the Tamil script is concerned.
    The grammar of Tamil, chief among the Dravidian languages and
    one of the two classical languages of India (the other being
    Sanskrit), clearly defines in the Tolkaappiyam (whose Ur-text
    dates to the centuries BCE) a diacritic, the puLLi (U+0BCD),
    to generate a "pure" consonant. Nakanishi does not mention
    the puLLi by name, but he does mention its importance in the
    Tamil script. Because the puLLi is so well defined, Tamil never
    had to develop conjunct consonants. So, automatically, thanks
    to the puLLi, Tamil does the opposite of Nakanishi's rule (5).
    Btw, archaeologically, the puLLi in Tamil is well
    attested from the second century onwards.
    Because of the absence of conjunct letters, the
    Tamil script was used first among Indic scripts
    whenever a new technology appeared on the scene.
    Examples are 1) printing, 2) typewriters, and
    3) bilingual emails in 8-bit encodings like TSCII.
    OCR is much easier for the lucid Tamil script than for
    any other Indic script.
    The contrast of conjunct consonants is seen
    clearly when you compare Tamil script
    with Devanagari or Tamil grantham script.
    The scripts for Indo-Aryan languages have never
    had a clear concept of puLLi/viraama as an orthographic
    device. As a result, native Hindi speakers get confused and cut off
    -a sounds in Sanskrit words even in places where
    no virama is written.
    The use of the term virAma in Sanskrit to refer to a written sign
    marking the subtraction of the vowel 'a' from the consonant sign is
    very late, and is not found in the texts of the Sanskrit grammarians.
    In those works the term virAma does exist, but it marks the end of an
    utterance (cf. virAmo 'vasAnam, Panini 1.4.110) or a pause. Its
    immediate reference is phonetic (cessation of the phonetic process
    of utterance), not orthographic.
    The phonetic reference of viraama is seen in the alternative names of
    U+0964 and U+0965 (DEVANAGARI DANDA and DEVANAGARI DOUBLE DANDA), viz.
    puurNa viraama and diirgha viraama. So viraama really means to stop
    pronouncing, say at the end of a statement or verse.
    On the other hand, Tamils devised the puLLi orthographically
    to do a job: to "kill" the inherent -a in the so-called "consonants"
    of other Indic languages. Nakanishi's rule (3) is invalid for Tamil!
    So I will write a small proposal to include data on the Tamil puLLi,
    its definition in ancient Tamil grammars and epigraphs,
    and its use in making the Tamil script a lot simpler and more lucid,
    in the information on Indic script characteristics in the Devanagari
    section, Ch. 9 of the Unicode Standard.
