Re: DIY OpenType Re-ordering

From: Philippe Verdy (
Date: Fri Mar 24 2006 - 21:27:41 CST


    From: "Antoine Leca" <>
    > Philippe Verdy wrote:
    >> 919, 94D : dead "NG" (NGA + VIRAMA):
    > More exactly, "dead NGA", or "NGAd". OK.
    > So you agree this is one identified combining character sequence (KI
    > being another).

    I did not say that at that point. This is only what can be said from looking at those first two characters. The rest needs to be parsed to get the final rendering.
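
    For reference, the hexadecimal values quoted in this thread (919, 94D, then 915, 93F) can be looked up by name. This is only a small sketch using Python's standard `unicodedata` module; the names `SEQUENCE` and `describe` are made up for the illustration.

```python
import unicodedata

# The four code points discussed in this thread, in logical (stored) order:
# NGA + VIRAMA (the "dead NGA"), then KA + vowel sign I (the "live KI").
SEQUENCE = "\u0919\u094D\u0915\u093F"

def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(SEQUENCE):
    print(code_point, name)
```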

    >> uses full NGA with subjoined halant by default,
    > ^^^^^^^^^^

    Don't protest too soon; this is discussed further below. The "by default" just means that, before the rest of the string has been parsed and is known to exist, this is temporarily the default form. But the rest is parsed.

    >> or half-form NGA if followed by consonant,
    > ^^^^^^^^^^^^^
    > Beg your pardon, but I do not know such a thing, neither what could be a
    > "half-consonant form of NGA," NGAh.

    I wanted to cover other Indic scripts at the same time. Also, Devanagari is used by so many languages that I cannot be sure none of them has a half-form for NGA.

    >> or half-form NGA, unless ligatured with next consonant.
    > Cannot make sense of this.

    There is an undesired repetition of the words "or half-form NGA,", and the "unless" applies to the text before it. I suppose you saw this, as it was obviously a cut-and-paste error made when reordering the sentence while editing (otherwise the expression "or half-form NGA, or half-form NGA" makes no sense, as the two are identical).

    >> 915, 93F : live "KI" (KA + I): uses full-KA form, unless ligatured
    >> with previous consonant.
    > Live KA, or KAl, which has the nominal form of KA, + Ivs; or Ln + Ivs. Yes.
    > And here is precisely the point: the rules for Devanagari are based upon the
    > ligatures (or not) between the various consonants in a cluster, which appear
    > to group several combining character sequences. As such, introducing a rule
    > speaking about c.c.s. to define reorderings does not seem perfect to me, it
    > merely confuses things, IMHO.
    > Or it requires to define a whole new concept, here "something which
    > *applies* to a (or several) c.c.s."
    > No definition for "to apply" found, back to square one.
    >> The choice between full-form NGA with subjoined halant or half-form
    >> depends on locale,
    > Well, Mr. Karlsonn was asserting in a previous post that any possible
    > variation here would depend on
    > A *spelling* difference that should be recorded in the
    > sequence of characters (in some, not yet standardised,
    > way), quite apart from font issues.
    > Sorry, but I see a very big difference between the two positions.
    >> but by default NGA and KA do not ligate,
    > ^^^^^^^^^^
    > More huh!?
    > Please send me the quote from your official book about Nagari (Sanskrit)
    > Manual of Style saying so.
    > Please do not point me toward Hindi material. Only Sanskrit.
    > (I am hearing that Hindi will not use NG in this position, rather anuswar on
    > the preceding consonant; but I cannot be completely affirmative here, so
    > please double check this information, or give me authoritative statements;
    > thanks in advance.)

    Anusvara cannot represent a prenasalisation before the FIRST consonant of a word (where would you put it? On the previous word?). I was told that anusvara marks a postnasalisation of a syllable or the nasalisation of its vowel.

    If you use the Sanskrit rule, how would you transcribe into Devanagari the pronunciation of the French word "encre"? (The first akshara would need to be the syllable "en-", which is a vowel letter with anusvara nasalisation and no prenasalisation of the next syllable.) And then the English word "anchor"? (Anusvara would not be appropriate here, as it would nasalise the "an-" syllable instead of marking the N consonant, which would be better transcribed with a dead NGA marking the prenasalisation of the next live syllable "-chor".)

    >> If you insert a ZWJ in the middle [...]
    >> Anyway, the vowel I is still reordered before the half-form NGA.
    > How does it link with c.c.s.?
    > Why is it different among Indic scripts?
    >> If you insert a ZWNJ, it blocks the reordering of I before dead NGA,
    > How does it link with c.c.s., and "to apply"?

    I'm not sure what you mean by "c.c.s.". If it is "combining character sequence", then it is a Unicode term that is not appropriate here. We are speaking of "grapheme clusters".
    My opinion is that the behavior of ZWNJ (which blocks a reordering) is mandatory. But the absence of ZWNJ (or the use of ZWJ) does not forbid blocking the reordering of I before a dead letter with halant. The default is just that the i-matra will try to move as far left as possible within the consonant cluster.
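
    The default just described can be sketched in a few lines of Python. This is only an illustration of the behavior argued for in this message, not the algorithm of any actual OpenType shaping engine: the function name `reorder_i_matra` is made up, consonants are assumed to be the range U+0915..U+0939, and everything else (nukta, reph, other matras, locale rules) is ignored.

```python
VIRAMA = "\u094D"   # halant
I_MATRA = "\u093F"  # vowel sign I, drawn to the left of its cluster
ZWJ = "\u200D"
ZWNJ = "\u200C"

def is_consonant(ch):
    # Assumption for this sketch: KA..HA are the only consonants considered.
    return "\u0915" <= ch <= "\u0939"

def reorder_i_matra(text):
    """Move each I-matra to the start of its consonant cluster.

    A cluster is extended by consonant + VIRAMA (optionally + ZWJ).
    A ZWNJ ends the cluster, so the matra cannot move past it.
    """
    out = []
    cluster_start = None  # index in `out` where the current cluster begins
    joined = False        # True right after consonant + VIRAMA (+ ZWJ)
    for ch in text:
        if is_consonant(ch):
            if not joined:
                cluster_start = len(out)  # a new cluster begins here
            out.append(ch)
            joined = False
        elif ch == VIRAMA and cluster_start is not None and is_consonant(out[-1]):
            out.append(ch)
            joined = True   # the next consonant stays in the same cluster
        elif ch == ZWJ and joined:
            out.append(ch)  # ZWJ does not break the cluster
        elif ch == I_MATRA and cluster_start is not None:
            out.insert(cluster_start, ch)  # slide to the left edge of the cluster
            cluster_start = None
            joined = False
        else:
            # ZWNJ and any other character end the current cluster.
            out.append(ch)
            cluster_start = None
            joined = False
    return "".join(out)

NGA, KA = "\u0919", "\u0915"
print([hex(ord(c)) for c in reorder_i_matra(NGA + VIRAMA + KA + I_MATRA)])
print([hex(ord(c)) for c in reorder_i_matra(NGA + VIRAMA + ZWNJ + KA + I_MATRA)])
```

    In the first call the I-matra ends up before the dead NGA; in the second the ZWNJ keeps it immediately before KA.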

    But a specific locale may have its own rules for breaking within a consonant cluster to create two aksharas, so in that case ZWNJ is not strictly needed for that language, because it is implied by the presence of a dead letter marked with halant. In Devanagari, there are not many dead letters that appear to have a (full-form + halant) form, just a few like NGA. It's easy to detect them: they have no danda (or "half-danda" for KA) on the right side of their full form.
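
    As a rough sketch of such a locale-dependent rule, the following Python splits a string into aksharas and breaks after a dead letter that has no half-form. Everything here is an assumption made for illustration: the function name `split_aksharas`, the `no_half_form` parameter, and the choice of NGA as the only member of that set (taken from the example above); a real locale table would have to be supplied by someone who knows the language.

```python
VIRAMA = "\u094D"  # halant
NGA = "\u0919"

def is_consonant(ch):
    # Assumption for this sketch: KA..HA are the only consonants considered.
    return "\u0915" <= ch <= "\u0939"

def split_aksharas(text, no_half_form=frozenset({NGA})):
    """Split text into aksharas.

    A consonant normally joins the akshara of a preceding consonant + VIRAMA.
    If the dead consonant is listed in `no_half_form` (hypothetical per-locale
    data), the akshara is closed right after its VIRAMA, as if a ZWNJ followed.
    """
    pieces = []
    current = ""
    for ch in text:
        # A consonant not preceded by VIRAMA starts a new akshara.
        if is_consonant(ch) and current and not current.endswith(VIRAMA):
            pieces.append(current)
            current = ""
        current += ch
        # Implied break: dead letter without a half-form.
        if ch == VIRAMA and len(current) >= 2 and current[-2] in no_half_form:
            pieces.append(current)
            current = ""
    if current:
        pieces.append(current)
    return pieces

word = "\u0919\u094D\u0915\u093F"  # NGA + VIRAMA + KA + I
print(len(split_aksharas(word)))                            # two aksharas
print(len(split_aksharas(word, no_half_form=frozenset())))  # one akshara
```

    With the break applied, the I-matra belongs to the second akshara only, so it has no cluster to cross and stays next to KA.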

    But Marathi has one exception, and it is encoded specifically in the Devanagari block (the half form of the letter is made by removing the small vertical stem that attaches it to the horizontal joining line). Given this possibility in Marathi, I would not be surprised if, among the many locales using the Devanagari script, there were one where NGA uses the same technique to create a half-form for dead NGA... So I can't be very affirmative when describing it. If so, it does not need a (full-form + halant) representation, it does not create a separate akshara, and the I-matra will slide normally to its left, even when the dead NGA marks only a prenasalisation of the next consonant and is then part of the same syllable.

    I see here the same differences as in IPA between the baseline Latin ENG letter used at the end of a syllable, as in "camping" (equivalent to a final dead NGA), the IPA combining tilde used to render the anusvara (vocalic nasalisation), and the superscript ENG used before an occlusive consonant (voiced or unvoiced) to mark its prenasalisation. Sanskrit may not make a clear distinction between the first two cases, so it uses anusvara also in the first case (dead NGA). I know that some regional accents also pronounce the nasalised vowels marked by anusvara with an additional "-n" or "-ng" postnasalised consonant. This confusion in the use of anusvara is quite common, and is often found among publishers who replace the pure vocalic modifier "candrabindu" by anusvara...
