Re: Virama based model - a note (was: Malayalam digit zero - an error)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 29 2005 - 13:09:39 CST

    From: "N. Ganesan" <naa.ganesan@gmail.com>
    > Quite simply, A. Nakanishi is wrong in stating the rules (3)
    > and (5) as far as Tamil script is formulated.
    > Tamil grammar, chief amidst Dravidian languages and,
    > one of the two classical languages
    > of India the other being Sanskrit, clearly
    > defines in Tolkaappiyam (its Ur-text dates to
    > pre-centuries BCE) a diacritic letter, puLLi (U+0BCD)
    > to generate "pure" consonant. Even Nakanishi does
    > not mention pulli by name, but mentions its
    > importance in Tamil script. Because pulli is
    > so well defined, Tamil never had to develop
    > conjunct consonants. So, automatically, thanks
    > to pulli, Tamil does opposite of Nakanishi's rule (5).
    > Btw, archaeologically, puLLi in Tamil is well
    > attested from second century onwards.
    > Because of absence of conjunct letters,
    > Tamil script was used first among Indic scripts
    > whenever a new technology appeared on the scene.
    > Examples are 1) printing 2) typewriters
    > 3) bilingual emails in 8-bit encodings like TSCII.
    > OCR is way easier for the lucid Tamil script compared to
    > any other Indic script.
    >
    > The contrast of conjunct consonants is seen
    > clearly when you compare Tamil script
    > with Devanagari or Tamil grantham script.
    > The scripts for Indo-Aryan languages never
    > have a clear concept of puLLi/viraama as an orthographic
    > device. As a result Hindi native speakers confuse and cut off -a sounds
    > in Sanskrit words even at places there is
    > no virama existing etc.,
    >
    > The use of the virAma in Sanskrit to refer to a written ligature
    > marking subtraction of vowel 'a' from the consonant sign is very
    > late, and not to be found in the texts of Sanskrit grammarians.

    I fully agree with your analysis. It also confirms my criticism of the
    current proposal for encoding Javanese, which is likewise based on the
    false assumption that the inherent vowel of Javanese letters is 'a', when
    in fact some consonants have an inherent 'e' vowel instead, notably 're'
    and 'le'.
    This affects the virama-based model, which treats the virama mostly as a
    mark for the absence of a vowel on the previous consonant, when in fact it
    is a *leading* modifier that creates the conjunct form of the *following*
    combining consonant (which still keeps its inherent vowel, 'a' or 'e', and
    to which the medial vowel signs apply).
    This model is also insufficient to describe scripts like Javanese, because
    it forgets the case of final consonant signs (like '-h', '-ng', '-r' or
    '-l') that have NO inherent vowel, and that Unicode sometimes describes
    (incorrectly) as 'vocalic l' or 'vocalic r'.
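
    To make the encoding order under discussion concrete, here is a minimal
    sketch using Python's standard unicodedata module. It only lists the code
    points of two sequences as the virama model encodes them: the virama sits
    between the two consonants, and the vowel sign is encoded after the second
    consonant. The names describe, devanagari_kssi and tamil_k are
    illustrative choices, not part of any standard.

```python
import unicodedata

# Sequences in logical (encoded) order under the virama model.
# Devanagari: KA + VIRAMA + SSA + VOWEL SIGN I (a conjunct followed by a vowel sign)
devanagari_kssi = "\u0915\u094D\u0937\u093F"
# Tamil: KA + puLLi (encoded as TAMIL SIGN VIRAMA), i.e. a pure consonant
tamil_k = "\u0B95\u0BCD"


def describe(text):
    """Return (code point, character name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for label, seq in (("Devanagari", devanagari_kssi), ("Tamil", tamil_k)):
        print(label)
        for code_point, name, ccc in describe(seq):
            print(f"  {code_point}  {name}  (ccc={ccc})")
```

    Note that the character database says nothing about whether the virama
    "belongs" to the preceding or the following consonant; that
    interpretation is left to the rendering rules of each script.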

    The reality of those North Brahmic scripts, related through their common
    historic Devanagari parent, is that they are much closer to Tibetan than
    the standard's description suggests. This apparent complexity of the North
    Brahmic scripts was artificially hidden in the ISCII standard, which
    sought to unify these scripts abruptly under the same model, so that it
    would also work with the South Brahmic scripts.

    The more fundamental model is based on classifying consonants, each with
    its own inherent vowel, into two alternate forms: a leading form, which is
    the base for creating syllabic conjuncts, and a conjunct (medial) form,
    which combines with the base consonant and has no inherent vowel of its
    own. So the vowel signs still modify the base consonant and not the
    conjunct form of the medial consonant, even if they are spelled after the
    medial consonant.

    However, I won't say that the Unicode standard is wrong. It just
    oversimplifies the problem, and so lets people using the standard make
    false assumptions about the true structure of those Brahmic scripts.


