RE: conjuncts beginning with independent vowel?

From: Peter_Constable@sil.org
Date: Thu Jan 18 2001 - 12:03:01 EST


On 01/18/2001 10:38:34 AM Marco Cimarosti wrote:

>I wonder then, how about renaming "Syloti Nagri Ng" as "Syloti Nagri
>anusvara"? And possibly encode it in position U+XX02, for analogy with other
>blocks?
>
>Anusvara is in fact the sign to indicate a nasal sound after a vowel, and it
>is also applicable to independent vowels. BTW, AFAIK, Bengali anusvara has
>precisely the same "NG" sound, so it might act as a precedent (hmmm... sorry
>if this had already been said).

Yes, I hadn't noticed it before, but in one of the documents my contacts
sent me, they did indicate that this corresponds to anusvara from Bengali
and Gujerati. Quoting (forgive the few characters for which you won't have
the right fonts):

<quote>
(5) The Sylheti anuswar w is written over the syllable just like the
Bengali chandrabindu u or Gujerati anuswar ¢ which serve to nasalise
the syllable. But it is pronounced ng like the Bengali anuswar s. (mô). It
corresponds to the Kaithi (Embedded image moved to file: pic06334.pcx).
</quote>

>One could also dare to propose that "Syloti Nagri consonant sign So" be
>renamed "Syloti Nagri visarga" (U+XX03), on the basis that "S" is
>etymological sound of visarga in Sanskrit.

That would not be particularly well motivated since the proposed consonant
sign So is clearly a conjoined form of So and is clearly *not* related to
Sanskrit visarga. There's no reason to treat this case any differently
than To, Ro and Lo.

>What impedes you to represents these VC clusters as V+C+virama (notice: with
>virama *after* the consonant, not before)?
>
>This would be the logical representation, and would deliver the correct
>phonetic information (for the benefit of people or application interested in
>sounds; e.g. linguists or screen readers).

Well, it may seem logical. But it would not benefit readers, since they
would never see any overt virama after the C. Furthermore, this would not
result in a VC conjunct in any existing implementation model, and it could
incorrectly lead to conjunct formation with a following C. So it would
really need to be V + C + virama + ZWNJ, but even then you don't get the
desired rendering in existing implementation models.
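
To make the comparison concrete, here is a minimal Python sketch that spells
out the three candidate sequences code point by code point. It is only an
illustration: this thread does not assign any code points to Syloti Nagri, so
Devanagari characters (A, TA, VIRAMA) are used purely as stand-ins, and the
helper name `describe` is hypothetical.

```python
import unicodedata

# Stand-ins only (an assumption for illustration): Devanagari characters are
# used because this thread does not assign code points to Syloti Nagri.
V = "\u0905"       # DEVANAGARI LETTER A, an independent vowel
C = "\u0924"       # DEVANAGARI LETTER TA, a consonant
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

# The three orderings discussed in this message.
SEQUENCES = {
    "V + virama + C": V + VIRAMA + C,
    "V + C + virama": V + C + VIRAMA,
    "V + C + virama + ZWNJ": V + C + VIRAMA + ZWNJ,
}


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


for label, seq in SEQUENCES.items():
    print(label, "->", describe(seq))
```

The sketch only lists what is stored in the text; it says nothing about how
any particular rendering engine would shape these sequences.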

>Of course, a sequence like V+C+virama would be quite a new case for
>smart-font technologies (but wouldn't break Graphite, would it?).

Correct.

>My question is: would such an encoding sequence be unambiguous in this
>script, or can it be confused with other kinds of clusters and/or with
>visible viramas?

Yes, there would be such confusion. V + virama + C seems the way to go for
To, Ro, Lo and So, and Ng is evidently a cognate of Gujerati anusvara, and
so should be treated as a combining mark.
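
As a rough cross-check of the precedent, the following Python sketch looks up
the existing Bengali and Gujarati signs mentioned above and confirms that the
Unicode character database classes them as marks, with the two anusvaras
sitting at offset 02 within their blocks. The dictionary `SIGNS` and the
helper `is_mark` are hypothetical names chosen for this illustration.

```python
import unicodedata

# Existing encoded signs referred to in this thread.
SIGNS = {
    "Bengali candrabindu": "\u0981",
    "Bengali anusvara": "\u0982",
    "Gujarati anusvara": "\u0A82",
}


def is_mark(ch):
    """True if the general category is a mark (Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


for label, ch in SIGNS.items():
    # Print the code point, its formal name, its category, and the mark test.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          unicodedata.category(ch), is_mark(ch))
```

Whether a future Syloti Nagri sign would receive the same properties is not
something this check can show; it only documents the analogous characters.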

Thanks, everyone, for the feedback. I think this is resolved to my
satisfaction.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>




