Re: NUKTA

From: James E. Agenbroad (jage@loc.gov)
Date: Thu Aug 24 2000 - 11:46:55 EDT


On Wed, 23 Aug 2000, Jaap Pranger wrote:

> At 18:05 +0200 2000.08.23, James E. Agenbroad wrote:
>
>
> >In a list of Devanagari conjuncts if compiled a while ago there are at
> >least two cases of conjuncts in which both consonants have a nukta:
> >1. Ka + nukta + halant + ka + nukta = qqa
> >2. Ka + nukta + halant + pha + nukta = qfa
> >
     That should say "A list of conjuncts I compiled a while ago" sorry.

> >I think:
> > 1. Any consonant can have a nukta. But if a Unicode character includes a
> >precomposed nukta, U+0929, 0931, 0934 and 0958 through 095F, and has
> >another nukta, U+093C, following it, I'd ignore the second nukta during
> >rendering.
>
> Mac typing behaviour (as far as I can see) is a bit different in that you
> can't type the precomposed nuqta characters with a single keystroke,
> and when you type a nuqta where you should not (e.g. as a second nuqta,
> or after a base character that shouldn't have a nuqta), the rendering
> translates your (faulty) typing into a clearly visible spacing character.
> (This spacing char is also a nuqta, but down below the baseline.)
>
> I think ignoring an erroneously typed char during rendering is not
> a good thing. Is rendering faulty data correctly not *as* bad as
> rendering correct data incorrectly?
>
     You have a point. What to do? I guess two dots below in a
horizontal line would be better than two stacked vertically. Perhaps a
distinction needs to be made here between rendering text dynamically
as keyed, where the syllable boundary is not always known, and
rendering a more or less fixed text, where syllable boundaries can be
determined. ('More or less' here is meant to suggest a possible further
distinction between rendering for proofreading and correction and
rendering with no chance to alter the text.)
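
     For the proofreading case, one option is to report a second nukta
rather than drop it silently. The following is only a sketch of that idea
in Python; the name redundant_nuktas is hypothetical, and the set of
letters treated as already carrying a nukta is simply the list quoted
above (U+0929, 0931, 0934 and 0958 through 095F).

```python
NUKTA = "\u093C"

# Letters the quoted message lists as already including a nukta.
PRECOMPOSED_NUKTA = {"\u0929", "\u0931", "\u0934"} | {
    chr(cp) for cp in range(0x0958, 0x0960)
}


def redundant_nuktas(text):
    """Return the indexes of nuktas that directly follow another nukta
    or a letter that already includes one."""
    hits = []
    for i in range(1, len(text)):
        prev = text[i - 1]
        if text[i] == NUKTA and (prev == NUKTA or prev in PRECOMPOSED_NUKTA):
            hits.append(i)
    return hits
```

A display used for correction could highlight the reported positions, while
a display with no chance to alter the text could choose to ignore them.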
 
> >Whether a vowel or vowel sign can have a nukta I do not know.
>
> Don't think so.
>
     In another posting Mr. Leca says ISCII 91 uses nukta with both vowels
and vowel signs to input certain uncommon cases. So I guess it would be
safer to allow them, presumably before U+0901 to U+0903. It would help to
know whether these are just input conventions or are also how long vocalic
rii and both vocalic li and lii are stored.
>
> > 2. A nukta should immediately follow a consonant--before a halant or
> >vowel sign or 'various signs' = candrabindu, anusvara, visarga = U+0901 to
> >U+0903 only.
> > 3. These 'various signs' should follow a nukta, vowel sign (or
> >halant?). I'm unsure if one of these 'various signs' after a halant
> >would be valid; I doubt if 'various sign' followed by halant is.
>
> No 'various signs' after halant, and no 'various signs' followed by
> halant, I would say.
>
     Good.
 
> 'Nuktated' consonants always (?) belong to Urdu words, and visarga
> "occurs almost exclusively in Sanskrit loanwords", thus the occurrence
> of nuqta followed by visarga is highly unlikely, or non-existent.
> (I don't consider U+095C, U+095D and U+095F as nuktated; should I?)
>
     Well, Unicode 3.0 page 403 does say they are identical to the base
character followed by nukta.
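
     This can be checked against the Unicode Character Database. The sketch
below uses Python's standard unicodedata module; note that it reflects the
database version shipped with the Python in use, not Unicode 3.0 itself,
and the helper name nukta_decomposition is hypothetical.

```python
import unicodedata


def nukta_decomposition(ch):
    """Return the canonical decomposition of ch as a list of code points."""
    return [int(cp, 16) for cp in unicodedata.decomposition(ch).split()]


# The three letters asked about in the quoted message.
for cp in (0x095C, 0x095D, 0x095F):
    parts = nukta_decomposition(chr(cp))
    print(f"U+{cp:04X} -> " + " ".join(f"U+{p:04X}" for p in parts))
```

Each of the three decomposes to a base letter followed by U+093C:
U+095C to U+0921 U+093C, U+095D to U+0922 U+093C, and U+095F to
U+092F U+093C.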

> > 6. [...........] a vowel sign immediately after a vowel is unlikely.
>
> Yes, there is a reason for U+0906, the <0905><093E> sequence for
> instance is invalid I guess.
>
      Yes indeed. Mr. Leca points out that ISCII uses two halants to
mean an 'explicit halant'--one not to be replaced by a more complicated
conjunct. I guess I prefer the Unicode ZWJ.
>
> > 7. Unicode 3.0 fig. 9-3 (4) to the contrary notwithstanding, halant
> >immediately followed by a vowel sign or an independent vowel is highly
> >questionable--just consonant + vowel sign would seem preferable.
>
> I would like to know in which word(s) this 'rare' sequence occurs,
> in Sanskrit?
>
     On page 554 (middle of second column) of Monier Williams
Sanskrit-English Dictionary there are three words beginning 'nirr.' where
'r.' is the vocalic ri. It displays as the independent ri vowel with a
reph above it; but they are filed after 'niru' and before 'nire' which
strongly suggests to me that the ra consonant + ri vowel sign are present
but display strangely. The first means to go out or fall away from; the
second to go asunder or pass away; the last to let out, deliver. The
first has several related words and citations. There may be others; I only
know of these. Does anyone know if there is an automated version of
this work (like the OED) that one could search for all occurrences of ra
(consonant) + ri (vowel sign)?

> Also, the explanatory text: "When an independent vowel appears ... ...
> ... ..., the independent vowel should not be depicted as a dependent vowel
> sign, but as an independent vowel letterform", is a bit beyond me.
>
     And me. I tried to get the fourth example changed but I failed
because I couldn't point to ISCII practice for this--it says nothing about
this. I take 'appears' to mean 'displays' or 'is desired to display as'
and 'depicted' to mean 'is encoded'. My preference would be ra + ri vowel
sign, U+0930 and U+0943, with no halant, and leave it for rendering
software to deal with. I was able to get ISCII treatment of Marathi
'eyelash ra' into 3.0.
>
> >All the above IMHO.
>
> the same for me,
>
> Jaap
>
     If my suggested list of the expected order of certain Devanagari
codes has any utility would anyone like to restate it in light of the
comments on it?--preferably (to me at least) in a not too algebraic form.
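
     One possible restatement, offered only as a sketch, is a table of
adjacent pairs that this thread treats as invalid or doubtful. The Python
below is an assumption-laden illustration, not an agreed rule set: the
names classify, DOUBTFUL, NUKTA_HOSTS and check_order are hypothetical, the
code point ranges are the Devanagari ranges as I understand them, and a
nukta is allowed after vowels and vowel signs on the "safer to allow them"
reasoning above. It looks only at classes of neighbouring characters.

```python
def classify(ch):
    """Map a character to a coarse Devanagari class."""
    cp = ord(ch)
    if 0x0901 <= cp <= 0x0903:
        return "sign"          # candrabindu, anusvara, visarga
    if 0x0905 <= cp <= 0x0914 or cp in (0x0960, 0x0961):
        return "vowel"         # independent vowels
    if 0x0915 <= cp <= 0x0939 or 0x0958 <= cp <= 0x095F:
        return "consonant"
    if cp == 0x093C:
        return "nukta"
    if 0x093E <= cp <= 0x094C or cp in (0x0962, 0x0963):
        return "vowel_sign"
    if cp == 0x094D:
        return "halant"
    return "other"


# (previous class, current class) pairs questioned in the discussion.
DOUBTFUL = {
    ("halant", "sign"): "various sign after halant",
    ("sign", "halant"): "various sign followed by halant",
    ("vowel", "vowel_sign"): "vowel sign after independent vowel",
    ("halant", "vowel_sign"): "halant followed by vowel sign",
    ("halant", "vowel"): "halant followed by independent vowel",
    ("nukta", "nukta"): "second nukta",
}

# Classes a nukta may directly follow in this sketch.
NUKTA_HOSTS = {"consonant", "vowel", "vowel_sign"}


def check_order(text):
    """Return (index, description) for each doubtful adjacent pair."""
    problems = []
    prev = "start"
    for i, ch in enumerate(text):
        cur = classify(ch)
        if (prev, cur) in DOUBTFUL:
            problems.append((i, DOUBTFUL[(prev, cur)]))
        elif cur == "nukta" and prev not in NUKTA_HOSTS:
            problems.append((i, "nukta without a preceding consonant or vowel"))
        prev = cur
    return problems
```

With these assumptions, the first conjunct mentioned earlier (ka + nukta +
halant + ka + nukta) and ra + ri vowel sign (U+0930 U+0943) both pass with
no problems reported, while <0905><093E> and ra + halant + independent
vocalic ri are reported as doubtful.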

     Regards,
          Jim Agenbroad ( jage@LOC.gov )
     The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
Phone: 202 707-9612; Fax: 202 707-0955; US mail: I.T.S. Dev.Gp.4, Library
of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.


