From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Sun Sep 04 2005 - 08:07:58 CDT
Kent Karlsson wrote:
> Richard Wordingham wrote:
>
>> It has become clear from some curious cases, such as
>> Devanagari TTA + VIRAMA
>> + TTHA + I with the Mangal font, that the orthographic
>> syllable can depend
>> on the font, and not simply on the characters. As there is
>> no ligature for
>> TTA and TTHA and no half-form for TTA, this sequence is two
>> orthographic
>> syllables - TTA + VIRAMA and TTHA + I.
>
> Then there are two orthographic syllables here, per definition.
> If there in addition is any ligating between adjacent
> orthographic syllables, then that is a separate issue.
> Are you claiming that reordering may take place over more
> than one orthographic syllable? If so, that should be carried
> in the underlying text somehow. It should not be a font
> dependence, as this would clearly be an orthographic difference.
The points are that:
(1) With the Mangal font, TTA + VIRAMA + TTHA + I should be two orthographic
syllables because of the font's conjunct repertoire, i.e. as the virama must
be visible, the vowel should attach to the TTHA. This special case (obvious
to a human) was missed in rule R15 in Section 9.1 of the Unicode standard,
perhaps because the case of no combination was omitted from Figure 9-3.
Uniscribe follows the standard literally in this case, with the result that for
Mangal one sees the two orthographic syllables TTA + VIRAMA + I (I hope I
have the order right) and TTHA, which is nonsense.
(2) On the other hand, with the Code 2000 font, TTA + VIRAMA + TTHA + I is a
single orthographic syllable. (The TTA.TTHA ligature is shown in the
Unicode standard in Table 9-2.)
(3) Devanagari TA + VIRAMA + THA + I should be and is a single orthographic
syllable in both fonts.
(4) This has already been discussed by others on the Indic list, so I don't
think it is down to me to make a formal error report.
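To make points (1) to (3) concrete, here is a rough sketch of how the split into orthographic syllables could depend on a font's conjunct repertoire. It is only an illustration and not how Uniscribe or any real renderer works. The function name, the two repertoire sets and their contents are my own assumptions, chosen to match the behaviour described above rather than read from the actual Mangal or Code 2000 tables. Nuktas, repha, ZWJ/ZWNJ and everything else are ignored.

```python
# Devanagari code points used in the examples above.
TTA, TTHA, TA, THA = "\u091F", "\u0920", "\u0924", "\u0925"
VIRAMA, VOWEL_I = "\u094D", "\u093F"


def is_consonant(ch):
    # Devanagari consonants KA..HA (simplified; no nukta forms).
    return "\u0915" <= ch <= "\u0939"


def is_vowel_sign(ch):
    # Devanagari dependent vowel signs AA..AU.
    return "\u093E" <= ch <= "\u094C"


def orthographic_syllables(text, conjuncts):
    """Split text into orthographic syllables.

    `conjuncts` is a set of (first, second) consonant pairs that the
    (hypothetical) font can join, by ligature or half-form.
    """
    syllables = []
    i, n = 0, len(text)
    while i < n:
        start = i
        i += 1
        if is_consonant(text[start]):
            # Extend over C + VIRAMA + C only while the font can join the pair.
            while (i + 1 < n and text[i] == VIRAMA
                   and is_consonant(text[i + 1])
                   and (text[i - 1], text[i + 1]) in conjuncts):
                i += 2
            if i < n and text[i] == VIRAMA:
                # No conjunct available: the virama stays visible and
                # closes the syllable.
                i += 1
            else:
                # Vowel signs attach to the last consonant of the cluster.
                while i < n and is_vowel_sign(text[i]):
                    i += 1
        syllables.append(text[start:i])
    return syllables


# Assumed repertoires, for illustration only.
MANGAL_LIKE = {(TA, THA)}
CODE2000_LIKE = {(TA, THA), (TTA, TTHA)}

retroflex = TTA + VIRAMA + TTHA + VOWEL_I
dental = TA + VIRAMA + THA + VOWEL_I

print(orthographic_syllables(retroflex, MANGAL_LIKE))    # two syllables
print(orthographic_syllables(retroflex, CODE2000_LIKE))  # one syllable
print(orthographic_syllables(dental, MANGAL_LIKE))       # one syllable
```

With the Mangal-like set the vowel sign ends up in the second syllable, with TTHA, which is the result argued for in (1).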
>> Do you yet have any examples of R1/R2 as opposed to R3/R4?
>
> R3/R4 would be used for (e.g.) Devanagari; see rule R15 (not to be
> confused with the property value names I suggested) on page 228 of
> TUS4. I'm not sure about R1/R2, and I'll leave that to be answered
> by someone more familiar with the Indic scripts than I am.
It also applies to the virama-model South Indian scripts I'm acquainted
with - Tamil, Burmese and Khmer. In theory, it's the lack of conjuncts in
the font that makes Tamil seem different, but I wouldn't be surprised if a
renderer only considered KSSA as a possible conjunct in Tamil. (I presume
the SHRI ligature is handled differently.)
>> I haven't seen any cases where they break consonant-vowel
>> ligatures.
>
> Some posted scans would be nice...
Sorry, I only have examples from the Internet :) As I posted this week on
the Indic list:
"Examples of OO can be found in verses 8 and 24 at
http://www.prapatti.com/slokas/tam2/dayaasaagarashatakam.pdf . An example
of /dadau/ can be seen at the end of the first line in verse 3 at
http://www.prapatti.com/slokas/tam2/aadivanshatakopamangalam.pdf . (Replace
'tam2' by 'english' for a transliteration into the Roman alphabet.)"
"However, you will have to explain away texts like
http://www.srivaishnavam.com/stotras/sgadya_tamil.pdf , which has many
examples of the first ligature, e.g. /pra.naamam/ at the start of the 3rd
paragraph. Interestingly, this text puts the superscript at the *end* of
the akshara, e.g. after vowel sign AA. Or is this not actually in the Tamil
script? I notice that it has visargas."
>> theory subscripts shouldn't be any worse than nuktas, and for simple
>> conjuncts, as in Brahmi, I think they shouldn't break the
>> conjuncts. After
>> all, they wouldn't present any problems when writing by hand.
>
> Where would you then display them? Inside in the middle of the
> conjunct somewhere? Again, presenting actual existing examples
> would be nice; if any exist.
For Brahmi, Khmer and Dai Lanna, the conjuncts are generally not ligated.
There are a few specific issues, but they go away if one is allowed to
substitute a superscript for a subscript and vice versa. Automating their
placement would be more complicated - the ascender of a subscript would have
to be moved away from the superscript of the base consonant.
Richard.