From: Cibu (cibucj@gmail.com)
Date: Tue Mar 22 2005 - 18:23:49 CST
Hi,
Since Chillu-NA and NA + visible VIRAMA can give a word different
meanings, we cannot let the rendering system choose between them.
Therefore, here are my preferences, in decreasing order:
1) Explicitly encode Chillu characters. Various issues are discussed
in detail below.
2) <NA, VIRAMA> (without any joiner) should be mapped to NA with a
visible Virama, because this enforces uniformity. That is, Consonant
+ VIRAMA will produce a visible Virama symbol, irrespective of whether
the consonant is capable of forming a Chillu. Example: SA + VIRAMA
and NA + VIRAMA will both have a visible Virama symbol.
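The uniformity in preference 2 can be sketched in Python. This is only an illustration, not a rendering-engine rule: the helper name expected_form is hypothetical, the set of Chillu-capable consonants (NNA, NA, RA, LA, LLA) is an assumption, and the Chillu itself is assumed to stay encoded as Consonant + Virama + ZWJ, as in the current representation discussed in this message.

```python
VIRAMA, ZWJ = "\u0d4d", "\u200d"
NA, SA = "\u0d28", "\u0d38"

# Assumption: consonants taken to have a Chillu form (NNA, NA, RA, LA, LLA).
CHILLU_CAPABLE = {"\u0d23", "\u0d28", "\u0d30", "\u0d32", "\u0d33"}

def expected_form(seq):
    """Classify a short sequence under preference 2 (hypothetical helper)."""
    if len(seq) == 2 and seq[1] == VIRAMA:
        # Same answer for every consonant, Chillu-capable or not.
        return "visible virama"
    if len(seq) == 3 and seq[1:] == VIRAMA + ZWJ and seq[0] in CHILLU_CAPABLE:
        return "chillu"
    return "other"

for consonant in (NA, SA):
    print(repr(consonant), expected_form(consonant + VIRAMA))
```

With this rule, the bare sequence never depends on which consonant is involved; only the explicit joiner asks for the Chillu.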
Issues in the current representation of a Chillu letter as Consonant + Virama + ZWJ
1) ZWJ and ZWNJ are supposed to be font directives, directing a font
to select among two or more semantically identical renderings. In the
case of Malayalam, this is no longer true. ZWJ becomes an alien language
construct introduced into Malayalam by Unicode to produce Chillu
letters. Thus, it is possible to produce two semantically different
words that differ only by ZWJ in their Unicode representation.
Example: അവന് (avan – meaning 'he') & അവന് (avan~ - meaning 'for
him')
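Since ZWJ is invisible in most displays, the difference is easier to see at the code point level. A minimal Python sketch, assuming the Consonant + Virama + ZWJ convention for the Chillu described above (the variable names are only illustrative):

```python
import unicodedata

ZWJ = "\u200d"
A, VA, NA, VIRAMA = "\u0d05", "\u0d35", "\u0d28", "\u0d4d"

# Assumption: Chillu-NA is written as NA + VIRAMA + ZWJ.
avan_chillu = A + VA + NA + VIRAMA + ZWJ   # avan, 'he'
avan_virama = A + VA + NA + VIRAMA         # avan~, 'for him'

def describe(word):
    """Return the Unicode character names that make up a word."""
    return [unicodedata.name(ch) for ch in word]

print(describe(avan_chillu))
print(describe(avan_virama))
print(avan_chillu.replace(ZWJ, "") == avan_virama)
```

The two listings are identical except for the trailing ZERO WIDTH JOINER, which is the whole difference between the two words.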
2) When a word is searched for in Unicode text, the search algorithm
should ignore ZWJ & ZWNJ, because it should not care about the
rendering of the word. By the first point, this does not hold for
Malayalam. However, if the search algorithm does not ignore ZWJ &
ZWNJ, it will surely miss some words that are semantically the same
but rendered differently by using or omitting ZWJ/ZWNJ.
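The dilemma can be shown with a small Python sketch. The helper names strip_joiners and find_word are hypothetical, and two things are assumed: that Chillu-NA is written as NA + VIRAMA + ZWJ, and that a ZWNJ after the Virama in Sabdam changes only how the word is rendered.

```python
ZWJ, ZWNJ = "\u200d", "\u200c"

def strip_joiners(s):
    """Remove ZWJ and ZWNJ from a string."""
    return s.replace(ZWJ, "").replace(ZWNJ, "")

def find_word(text, word, ignore_joiners):
    """Return the whitespace-separated words in text that match word."""
    key = strip_joiners if ignore_joiners else (lambda s: s)
    return [w for w in text.split() if key(w) == key(word)]

avan_virama = "\u0d05\u0d35\u0d28\u0d4d"        # avan~, 'for him'
avan_chillu = avan_virama + ZWJ                 # avan, 'he' (assumed encoding)
sabdam = "\u0d36\u0d2c\u0d4d\u0d26\u0d02"       # Sabdam, conjunct form
sabdam_zwnj = "\u0d36\u0d2c\u0d4d" + ZWNJ + "\u0d26\u0d02"  # same word, assumed rendering variant

text = avan_chillu + " " + sabdam_zwnj

# Strict matching keeps avan and avan~ apart but misses the Sabdam variant.
print(find_word(text, avan_virama, ignore_joiners=False))
print(find_word(text, sabdam, ignore_joiners=False))

# Joiner-insensitive matching finds Sabdam but also confuses avan~ with avan.
print(find_word(text, sabdam, ignore_joiners=True))
print(find_word(text, avan_virama, ignore_joiners=True))
```

Neither setting gives correct results for both words, which is the problem described in this point.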
3) Chillu of a consonant is different from its C1-conjoining form
without inherent അ (A).
3.1) Phonetic differences
Consider the combination: Vow + CC + Con.
Vow - a vowel
CC - a consonant capable of forming Chillu
Con - a consonant
When CC takes its Chillu form, it joins more closely with Vow. This
produces a noticeable small stop between CC and Con.
When CC takes its C2/C1-conjoining form without inherent അ
(A), it is pronounced closer to Con.
Examples:
ഉണര്വ് ഉണര്വ് (unlike its pair, not a meaningful word)
കല്വിളക്ക് വില്വാദ്രി
കണ്വട്ടം കണ്വന്
4) Chillu of a consonant can be treated as Anusvara
A. R. Raja Raja Varma states in his Keralapanineeyam (which is the
foremost grammar book of Malayalam) that "Anusvara is the Chillu of MA".
Thus, we can say that Malayalam has more than one Anusvara. There is
an Anusvara for MA; there are Anusvaras for NA, NNA, LA etc. This is
essentially the same as saying that Malayalam has a number of Chillus,
which include MA, NA, LA etc.
If we look closely, the phonetic rules are also the same for Anusvara
and the other Chillus, most importantly the half-stop property (please
see Appendix A) when it occurs in the middle of a word. Examples:
സംയുക്തം സാമ്യം
കല്വിളക്ക് വില്വാദ്രി
കണ്വട്ടം കണ്വന്
Essentially, this means Unicode should do one of the following:
1. Include separate character locations for Chillu characters
- solves the confusion of ല് (Chillu of LA/TA) (see below)
- addresses the above-mentioned Chillu representation issues
2. Allow Anusvara to be encoded as MA + Virama + ZWJ
- does not change the existing encoding for Chillu
- does not address the previously explained Chillu representation issues
Background
----------
A) Overloading of visible Virama in Malayalam
The following are its functions:
A.1) At the end of a word, it acts as the quarter vowel ഉ (U). Example: അവന് (avan~)
A.2) In the middle of a word, it means the preceding consonant forms
a conjunct with the following consonant. Example: ശബ്ദം (Sabdam). In
this context, it does not produce any sound whatsoever.
Function (A.2) was overloaded onto this grapheme when the
typesetting-friendly new orthography was introduced. Unicode
recognizes only function (A.2) for the visible Virama of Malayalam.
This contributes to the problem that the Unicode representations of
അവന് (avan) & അവന് (avan~) differ only by ZWJ/ZWNJ.
B) Evolution & Confusion of ല് (Chillu LA/TA)
For Sanskrit words used in Malayalam, ത (TA) is pronounced as such only
when a vowel or semi-vowel follows it. In all other cases, it
is pronounced as ല (LA).
An example is ഉത്സവം (ulsavam). Even though its Sanskrit-derived
form is ഉത്സവം (uthsavam), it is pronounced in Malayalam
as ഉല്സവം (ulsavam).
This means the Chillu form of ത (TA) should be pronounced as if it were
the Chillu form of ല (LA). Thus, ല് (chillu LA/TA) is in a very curious
situation:
B.1) Grapheme level: Graphically it is Chillu of ത (TA).
B.2) Character level: It can represent either of the characters ത (TA) or ല (LA).
B.3) Phoneme level: Its pronunciation is the Chillu of ല (LA).
Reference: കേരളപാണിനീയം (kEraLapaaNineeyam), പീഠിക (peeThika) - A. R.
Raja Raja Varma
thanks,
Cibu
-- More about me: http://www.blogger.com/profile/1246232