Re: Unicode and transliteration

From: Jeroen Hellingman
Date: Sat Aug 28 1999 - 07:32:44 EDT

This is one of the issues in Indic scripts that Unicode does not yet
address. My guess would be to have a zero width joiner somewhere
in between the consonant and the virama to get this letter, analogous to
the way half letters are produced in Devanagari.
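
To make the analogy concrete, here is a rough sketch of the code point sequences involved. It assumes the joiner is placed after the virama, as in the Devanagari half-form convention (consonant + virama + ZWJ); the exact position for Malayalam is only a guess, and the helper names `half_form_request` and `describe` are made up for illustration. Whether a given font or renderer shows a cillu glyph for the Malayalam sequence is a separate question that this does not settle.

```python
import unicodedata

ZWJ = "\u200D"                # ZERO WIDTH JOINER
DEVANAGARI_KA = "\u0915"      # example Devanagari consonant
DEVANAGARI_VIRAMA = "\u094D"
MALAYALAM_RRA = "\u0D31"      # the 'RA' (U+0D31) asked about in this thread
MALAYALAM_VIRAMA = "\u0D4D"


def half_form_request(consonant: str, virama: str) -> str:
    """Build consonant + virama + ZWJ (hypothetical helper).

    This is the order used to request a half form in Devanagari.
    Reusing it for a Malayalam cillu is an assumption, not a rule
    taken from the standard.
    """
    return consonant + virama + ZWJ


def describe(sequence: str) -> list[str]:
    """List each code point of a sequence with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


devanagari_half_ka = half_form_request(DEVANAGARI_KA, DEVANAGARI_VIRAMA)
candidate_cillu_rra = half_form_request(MALAYALAM_RRA, MALAYALAM_VIRAMA)

for label, seq in [("Devanagari half KA", devanagari_half_ka),
                   ("Candidate cillu RRA", candidate_cillu_rra)]:
    print(label)
    for line in describe(seq):
        print("   ", line)
```

The two sequences differ only in the script-specific consonant and virama, which is the point of the analogy: the joiner itself carries no script information.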

Another issue is how to represent a cillu letter with a subscribed
consonant. I've made some proposals that can be viewed

but these pages are in dire need of upgrading.


-----Original Message-----
From: <>
To: Unicode List <>
Date: Saturday, August 28, 1999 01:52
Subject: RE: Unicode and transliteration

>Can somebody tell me how I can write 'cillu' 'RA' in Unicode?
>For that matter, any 'cillu' form of a Malayalam character?
>I have put the glyph of 'cillu' form of 'RA'(U+0D31) in
>Please have a look at it.
>A character X having 'cillu' form may assume it when the next character Y
>is not a vowel. But, if Y has a symbol for it (eg: 'YA'), X may not assume
>Instead it will stay in its original form. This depends upon the words in
>XY combination appear. Two conflicting examples are: 'karmmam' and
>> -----Original Message-----
>> From: []
>> Sent: Wednesday, August 25, 1999 7:15 PM
>> To: Unicode List
>> Cc: Unicode List
>> Subject: Re: Unicode and transliteration
>> My question was not based on any implementations or other standards.
>> Given an encoding for a language, a Unicode font should be able to
>> reproduce all the glyphs existing in that language. As you all know,
>> it is not that intuitive. For example, Malayalam 'ra' (0D31) can take
>> at least three glyph forms depending on the context. Whether all these
>> glyphs are reproducible from a Unicode font should be a concern. For
>> this, rules should be defined, describing that such and such a
>> combination produces such and such a glyph. As an example,
>> <ra> + <zwj> produces the 'cillu' glyph of 'ra'. Does such a rule set
>> exist?
>In other words, the form is governed by some syntactic rule? Sounds
>like Arabic shaping rules.
>Should the different glyph forms be considered distinct shapes, or variants
>on a particular shape? More importantly, do they carry different semantic
>Does U+0D31 reflect a grammatical concept that goes beyond the phonemic?
>My reading of Unicode is that it does not handle the mapping from deep to
>surface structure; it only does the surface, the images used in written
