Re: latin equivalent to specific indian characters

From: Antoine Leca (
Date: Sun Dec 05 2004 - 11:15:23 CST


    I fail to see the connection between your question and Unicode.

    On Saturday, 4 December 2004 13:18Z, Rene Hache wrote:

    > To whom it may concern,


    > I am writing because I would like to know if someone can help with certain
    > Sanskrit/Pali characters in roman scripts.

    There is certainly a LOT of material about this around the net. Google is
    probably the best answer anyone can give you.
    As a second-level hint, it is my belief that you will find more
    material using Sanskrit as a keyword than using Pali. This should not mislead
    you: as always with Google and co., more material also means more wrong
    leads to check.

    > Most characters are simple, like vowels with macrons, or some letters
    > that have either a dot below or above.

    If you want to see things this way, you should try a coded character set
    that fits this description. Fortunately, such a thing exists, and a good
    choice could be IS 13194:1991, widely known as ISCII; in this coded character
    set, dha is a single code point (namely C5). ISCII is a good choice because
    you can easily print it using ad hoc software (CDAC is a good keyword here),
    and also because you can fairly easily map it from or to Unicode. Of course,
    collation and transliteration to Nagari or the other scripts used in India are
    trivial: they were objectives of the design.
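
    To illustrate the "somewhat easy" mapping, here is a minimal sketch in
    Python. It assumes the Devanagari script and covers only the consonants
    ka through na, which (as far as I recall the ISCII-91 table) occupy the
    contiguous bytes B3 to C6 and line up with U+0915 to U+0928. The function
    names are hypothetical, and the range should be checked against the
    standard before being relied upon.

```python
import unicodedata

# Assumption: ISCII-91 consonants ka..na are the bytes 0xB3..0xC6, and the
# matching Devanagari letters are the contiguous range U+0915..U+0928.
ISCII_KA = 0xB3
ISCII_NA = 0xC6
UNICODE_KA = 0x0915


def iscii_consonant_to_unicode(byte: int) -> str:
    """Map one ISCII consonant byte (ka..na only) to a Devanagari letter."""
    if not ISCII_KA <= byte <= ISCII_NA:
        raise ValueError(f"byte {byte:#04x} is outside the ka..na range")
    return chr(UNICODE_KA + (byte - ISCII_KA))


def unicode_consonant_to_iscii(letter: str) -> int:
    """Inverse mapping, again restricted to the ka..na range."""
    offset = ord(letter) - UNICODE_KA
    if not 0 <= offset <= ISCII_NA - ISCII_KA:
        raise ValueError(f"{letter!r} is outside the ka..na range")
    return ISCII_KA + offset


if __name__ == "__main__":
    dha = iscii_consonant_to_unicode(0xC5)
    print(f"U+{ord(dha):04X}", unicodedata.name(dha))
```

    A real converter needs the full table, plus handling for the nukta,
    halant and script-switching attribute bytes, none of which is shown here.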

    On the other hand, if you want to handle the textual material in Unicode (if
    not, I really cannot see why you are asking this here), you will have to use
    a collation process that is not straightforward, yet perfectly possible. The
    fact that dha is a single "letter" is not a real problem (this is a simple
    contraction, which any reasonable algorithm should offer); more interesting
    things appear when you realise that while dha is one letter, dhi is two.
    Even more interesting is that in traditional order, ã (nasalisation noted
    with candrabindu) precedes a (without nasalisation). And the real complexity
    begins when you study the rules for collating the anusvara (written as a dot
    above in Nagari script, and which can stand for itself or for the nasal
    matching the following consonant).
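
    The "simple contraction" part can be sketched as follows, in Python. This
    is only an illustration: the letter order below is an assumed IAST-style
    romanised alphabet, not something taken from a standard, and the function
    name is hypothetical. It deliberately ignores the candrabindu and anusvara
    rules, which are the hard part.

```python
import unicodedata

# Assumed romanised (IAST-style) letter order; digraphs such as "dh" and "ai"
# count as single letters, which is what a contraction expresses.
ALPHABET = [
    unicodedata.normalize("NFC", letter)
    for letter in (
        "a ā i ī u ū ṛ ṝ ḷ e ai o au ṃ ḥ "
        "k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ "
        "t th d dh n p ph b bh m y r l v ś ṣ s h"
    ).split()
]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}
MAX_LEN = max(len(letter) for letter in ALPHABET)


def collation_key(word: str) -> list[int]:
    """Build a sort key by greedy longest-match against ALPHABET."""
    text = unicodedata.normalize("NFC", word.lower())
    key = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so that "dh" wins over "d" + "h".
        for size in range(MAX_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in RANK:
                key.append(RANK[chunk])
                i += len(chunk)
                break
        else:
            # Unknown characters sort after every known letter.
            key.append(len(ALPHABET) + ord(text[i]))
            i += 1
    return key


if __name__ == "__main__":
    words = ["dhana", "dūta", "deva", "dama"]
    print(sorted(words, key=collation_key))
```

    With this key, every word starting with d sorts before any word starting
    with dh, whereas a plain code point sort would place "dhana" before "dūta".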


    This archive was generated by hypermail 2.1.5 : Sun Dec 05 2004 - 12:55:06 CST