From: Richard Wordingham <>
Date: Tue, 16 Aug 2011 02:59:14 +0100

On Mon, 15 Aug 2011 07:21:20 +0530
Shriramana Sharma <> wrote:

> On 08/15/2011 01:48 AM, Richard Wordingham wrote:

> > The issue is the relative ordering of candrabindu and virama.
> > For a C1-conjoining form (i.e. C2 relatively unmodified), <la virama
> > candrabindu la> is easier to handle. For a C2-conjoining form, <la
> > candrabindu virama la> is easier to work with.
> Hmm -- perhaps you mean this is so because it would be possible to
> easily map Virama + LA to the C2-conjoining form?

That's my motivation.
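
To make the two orderings concrete, here is a minimal Python sketch. It assumes Devanagari code points purely for illustration (the thread does not tie the question to one script), and the helper names describe, same_after_normalization and has_adjacent_virama_la are hypothetical. It lists both sequences, checks whether normalization treats them as the same string, and checks whether virama sits directly before the second LA, which is what a simple virama + LA mapping would need.

```python
import unicodedata

# Assumption: Devanagari is used only as a convenient example script.
LA = "\u0932"           # DEVANAGARI LETTER LA
VIRAMA = "\u094D"       # DEVANAGARI SIGN VIRAMA
CANDRABINDU = "\u0901"  # DEVANAGARI SIGN CANDRABINDU

# The two candidate orderings discussed above.
virama_first = LA + VIRAMA + CANDRABINDU + LA
candrabindu_first = LA + CANDRABINDU + VIRAMA + LA


def describe(text):
    """Return (code point, name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


def same_after_normalization(a, b, form="NFC"):
    """True if the two strings are identical after Unicode normalization."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def has_adjacent_virama_la(text):
    """True if virama is immediately followed by LA (hypothetical lookup key)."""
    return (VIRAMA + LA) in text


if __name__ == "__main__":
    for label, seq in (("virama first", virama_first),
                       ("candrabindu first", candrabindu_first)):
        print(label)
        for entry in describe(seq):
            print("   ", entry)
        print("    virama + LA adjacent:", has_adjacent_virama_la(seq))
    print("same after NFC:", same_after_normalization(virama_first, candrabindu_first))
```

Since candrabindu has combining class 0, normalization will not reorder it relative to virama, so the two spellings stay distinct strings and the choice between them has to be made by convention.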

> This is true
> enough, but it is advisable to have a single uniform representation
> across Indic scripts and that is LA + Virama + Candrabindu + LA
> (because of the reasons outlined by Peter and me in the previous
> mails I have linked to from the archives).

I can't think of any characters that can be viewed as decomposing in
some sense to consonant + virama. There are quite a few
characters that are functional equivalents to virama + consonant, and
some of these should be folded with virama + consonant in some

>> <snip>

> I know that and that is why I distinguish "Indian" Indic scripts and
> "non-Indian" (i.e. South East Asian [SEA]) Indic (i.e. Brahmic)
> scripts, especially in Unicode. It seems that at least in Khmer (I
> didn't check the other charts/chapters) one vocalic R/L vowel is
> represented by the independent vowel presented as a sub-base (which
> you call C2,...

This is not what I was talking about. The best relevant examples in TUS
6.0 Section 11.4 are the words for "both" and "already". The former
actually has nikahit + coeng!

> Hmmm -- I'm not sure I entirely grok the SEA situation with Thai/Tai
> Tham/Khmer etc, but I'm sure the handling of vowelless consonants and
> conjoining forms in those scripts does deviate from the *Indic*
> model. For example, see that stuff about the Balinese Surang and how
> it is handled...

Consider it a generalisation of anusvara! Limbu and Lepcha have an
array of final consonants, formally divorced from initial consonants.
Kharoshthi apparently used conjoining forms for final consonants, though
examples are few and TUS 5.0 says virama cannot follow a vowel. In
the Kharoshthi script, the difference between a subscript MA and ANUSVARA
is slight to vanishing.

> > I've seen a claim that vowels within Tibetan consonant stacks can be
> > handled sensibly within the confines of Unicode - I didn't
> > investigate it.
> I don't understand what you mean by "vowels within Tibetan consonant
> stacks".

All I've got to go on is the penultimate sentence in TUS 6.0 Section
10.2 - 'Rarely, stacks are seen that contain more than one such
consonant-vowel combination in a vertical arrangement'.

> I also don't know whether Tibetan language written in
> Tibetan script requires the conjoining forms of vowels but I do know
> (to an extent) that Sanskrit written in Tibetan doesn't require
> "conjoining" forms of vowels per se generated by a virama-like
> character.

The Tibetan script doesn't have a combining virama. I would expect the
natural coding to be something like letter-vowel-subjoined
letter-vowel, e.g. <U+0F40 TIBETAN LETTER KA, U+0F74 TIBETAN VOWEL SIGN
formal analogue would be the Thai word <U+1A2F TAI THAM LETTER DA,
U+1A63 TAI THAM VOWEL SIGN AA>, but it doesn't match visually - its
second vowel goes to the right of the consonants.

