From: N. Ganesan (email@example.com)
Date: Mon May 16 2005 - 14:09:08 CDT
P. Verdy wrote:
>I fully agree with your analysis. And it confirms what I can also criticize
>in the current proposal for encoding Javanese (which is also based on the
>false assumption that the inherent vowel of Javanese letters is 'a', when in
>fact some consonants are instead using an inherent 'e' vowel, notably 're'
Is this not covered in R. Ishida's tech note, UTN #10?
Page 3 mentions: "The inherent vowel can vary
in pronunciation from script to script, and examples
include U+0259, U+028C, U+0254".
Perhaps, in the Javanese script description in the standard, more
vowels taken out by virama can be recorded.
>This affects the Virama-based model, which considers that it is used mostly
>to mark the absence of a vowel in the previous consonant, when in fact it
>is a *leading* modifier to create the conjunct form of the *following*
>combining consonant (which still keeps its inherent vowel, 'a' or 'e', and
>to which the medial vowel signs apply).
Your statement on what Virama does is correct for North Indian
scripts, but not for Tamil.
Tamil grammars, and epigraphy from the 2nd century onwards,
state explicitly that the puLLi (virama in Unicode)
kills the inherent vowel /a/. Pure consonants
in Tamil are individual units, and they do not form
conjuncts as in other Indian scripts.
So, the Tamil puLLi (viraamam) is *not* a leading modifier to create the
conjunct form of the following combining consonant (which still keeps
its inherent vowel, 'a' or 'e', and to which the medial vowel signs
apply).
While other Indian languages do not use virama
to produce consonant clusters, Tamil extensively
uses virama to create pure consonants, whether in clusters or single.
(Nowadays, where Unicode fonts lack a rich set of
ligatures, an explicit virama is used in Hindi and so on,
following the model of Tamil script.)
This misunderstanding causes problems for Tamil hyphenation when
splitting words in web pages and word processing. Chapter 9.0 of the
Unicode Standard states that:
In the Tamil script, a consonant cluster is any sequence of one or
more consonants separated by viramas, possibly terminated by a virama.
While this is true, additional statements are needed. E.g.,
a syllable boundary in written Tamil does not
leave two or more consonants at the start of a syllable. Here,
each consonant stands alone, and the letters
adjacent to a pure consonant (indicated by the
puLLi dot) can be consonants or abugidas.
Tamil words like "illai", "angkE", "vAyppu"
should not be split i-llai, a-ngkE, vA-yppu
for hyphenation. Since Tamil words do not start
with pure consonant letters, the above example
words must be split as il-lai, ang-kE, vAyp-pu
and so on. In sum, consonant clusters must be treated
differently in Tamil, not as in Sanskrit.
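As a rough illustration of the splitting rule above, here is a minimal
Python 3 sketch. It is only my own illustration, not an algorithm from the
Unicode Standard, and the function names tamil_units and
hyphenation_chunks are hypothetical. It assumes that any character with
general category Mn or Mc (vowel signs, puLLi) belongs to the preceding
letter, and that a pure consonant (a letter ending in puLLi, U+0BCD)
always stays with the chunk before it.

```python
import unicodedata

PULLI = "\u0BCD"  # TAMIL SIGN VIRAMA (puLLi)


def tamil_units(word):
    """Group each base letter with the combining signs that follow it."""
    units = []
    for ch in word:
        # Assumption: Mn/Mc characters (vowel signs, puLLi) attach to the
        # preceding base letter.
        if units and unicodedata.category(ch) in ("Mn", "Mc"):
            units[-1] += ch
        else:
            units.append(ch)
    return units


def hyphenation_chunks(word):
    """Return chunks between which a hyphen could be placed.

    A pure consonant (unit ending in puLLi) is never allowed to start a
    chunk; it is appended to the chunk before it.
    """
    chunks = []
    for unit in tamil_units(word):
        if chunks and unit.endswith(PULLI):
            chunks[-1] += unit
        else:
            chunks.append(unit)
    return chunks


if __name__ == "__main__":
    illai = "\u0B87\u0BB2\u0BCD\u0BB2\u0BC8"               # "illai"
    vayppu = "\u0BB5\u0BBE\u0BAF\u0BCD\u0BAA\u0BCD\u0BAA\u0BC1"  # "vAyppu"
    print("-".join(hyphenation_chunks(illai)))
    print("-".join(hyphenation_chunks(vayppu)))
```

For the Unicode spelling of "illai" this yields two chunks, corresponding
to il-lai, and for "vAyppu" it yields vAyp-pu. A real hyphenator would
need more than this (minimum chunk lengths, non-Tamil characters, and so
on), so treat it purely as a sketch of the rule.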
Earlier I wrote:
This leads us to mention an important point about Virama (=viraama)
based models of Indian and even South East Asian languages. Nakanishi
states the general principles of the Indian
lettering system on p. 48:
Devanagari script uses the basic system used for all
the Indian scripts described in this chapter.
(3) Each consonant includes an inherent a-vowel.
(5) Conjunct consonants are used; when two or more
consonants are combined with no intervening vowel,
they are written as one letter.
Quite simply, A. Nakanishi is wrong in stating rules (3)
and (5) as far as the Tamil script is concerned.
The grammar of Tamil, chief among the Dravidian languages and
one of the two classical languages
of India (the other being Sanskrit), clearly
defines in the Tolkaappiyam (its Ur-text dates to
the centuries BCE) a diacritic, puLLi (U+0BCD),
that generates a "pure" consonant. Nakanishi does
not mention puLLi by name, but he does mention its
importance in Tamil script. Because puLLi is
so well defined, Tamil never had to develop
conjunct consonants. So, automatically, thanks
to puLLi, Tamil does the opposite of Nakanishi's rule (5).
Btw, archaeologically, puLLi in Tamil is well
attested from the second century onwards.
Because of the absence of conjunct letters,
Tamil script was the first among Indic scripts to be used
whenever a new technology appeared on the scene.
Examples are 1) printing, 2) typewriters, and
3) bilingual emails in 8-bit encodings like TSCII.
OCR is far easier for the lucid Tamil script than for
any other Indic script.
The contrast in conjunct consonants is seen
clearly when you compare Tamil script
with Devanagari or the Tamil grantham script.
The scripts for Indo-Aryan languages have never
had a clear concept of puLLi/viraama as an orthographic
device. As a result, native Hindi speakers get confused and cut off
-a sounds in Sanskrit words even in places where
no virama is present.
The use of the virAma in Sanskrit to refer to a written ligature
of vowel 'a' from the consonant sign is very late, and not to be found in
the texts of Sanskrit grammarians. In those works, the term virAma does
exist, but it marks the end of an utterance, cf. virAmo 'vasAnam (Panini
1.4.110), or a pause. Its immediate reference is phonetic (cessation of the
phonetic process of utterance), and not orthographic.
The phonetic reference of virama is seen in the names given for
U+0964 and U+0965, viz., puurNa viraama and diirgha viraama.
So, viraama is really to stop pronouncing, say at the end of
a statement or verse.
On the other hand, Tamils devised puLLi orthographically
to do a job: to "kill" the inherent -a in the so-called "consonants"
of other Indic languages. Nakanishi's rule (3) is invalid for Tamil!
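To make the puLLi's job concrete, here is a small Python 3 sketch of a
romanizer. It is just an illustration of the principle, with a
hypothetical function name (romanize) and deliberately tiny lookup tables
that I made up for the example; they cover only a handful of letters and
are nowhere near a full transliteration scheme.

```python
PULLI = "\u0BCD"  # TAMIL SIGN VIRAMA (puLLi)

# Illustrative subsets only (assumption: enough letters for the examples).
CONSONANTS = {"\u0B95": "k", "\u0BB2": "l", "\u0BAE": "m", "\u0BAA": "p"}
VOWEL_SIGNS = {"\u0BBE": "A", "\u0BBF": "i", "\u0BC1": "u", "\u0BC8": "ai"}
INDEPENDENT_VOWELS = {"\u0B85": "a", "\u0B87": "i"}


def romanize(text):
    """Romanize Tamil text, showing how puLLi removes the inherent -a."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            if nxt == PULLI:
                # puLLi kills the inherent vowel: pure consonant.
                out.append(CONSONANTS[ch])
                i += 2
            elif nxt in VOWEL_SIGNS:
                # A vowel sign replaces the inherent vowel.
                out.append(CONSONANTS[ch] + VOWEL_SIGNS[nxt])
                i += 2
            else:
                # Bare letter: the inherent -a is pronounced.
                out.append(CONSONANTS[ch] + "a")
                i += 1
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
            i += 1
        else:
            # Anything outside the tiny tables is passed through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(romanize("\u0B95"))         # bare letter
    print(romanize("\u0B95\u0BCD"))   # letter followed by puLLi
```

The bare letter U+0B95 comes out as "ka", while the same letter followed
by puLLi comes out as plain "k". No conjunct shaping is involved at any
point; each pure consonant is handled as a unit of its own.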
So, I will write a small proposal to include data on puLLi in Tamil,
its definition in ancient Tamil grammars and epigraphs,
and its use in making Tamil script a lot simpler and more lucid,
in the information on Indic script characteristics in the Devanagari
section, Ch. 9 of the Unicode Standard.