RE: Ya-phalaa

From: Michael Everson
Date: Wed Mar 05 2003 - 15:19:26 EST


    Andy, the ya-phalaa is a presentation form of conjoined YA, which is
    produced in Unicode by the sequence VIRAMA + YA. Encoding it as
    anything else makes very little sense. However it is pronounced
    today in Bengali, and however weird you feel about its being
    applied to an initial vowel, the fact is that it is still a
    presentation form of conjoined YA, and it should be encoded as such.

    Consider the fact that the Bhagavadgita is available in Sanskrit in
    Bengali script. This will certainly contain many, many examples of
    consonant clusters in -YA. These will all be encoded as VIRAMA + YA,
    not as some independent form of ya-phalaa.
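
    As an illustration only, here is a minimal Python sketch of what
    that encoding means in practice: a ya-phalaa in Bengali text can be
    found by looking for the two-character sequence VIRAMA + YA. The
    function name and the sample string are assumptions made up for the
    example; only the standard unicodedata module is used.

```python
import unicodedata

# Look the characters up by their Unicode names rather than hard-coding values.
VIRAMA = unicodedata.lookup("BENGALI SIGN VIRAMA")
YA = unicodedata.lookup("BENGALI LETTER YA")
TA = unicodedata.lookup("BENGALI LETTER TA")
KA = unicodedata.lookup("BENGALI LETTER KA")

# Ya-phalaa has no code point of its own: it is simply VIRAMA followed by YA.
YA_PHALAA = VIRAMA + YA


def count_ya_phalaa(text: str) -> int:
    """Count occurrences of the sequence VIRAMA + YA in text."""
    return text.count(YA_PHALAA)


# Hypothetical sample (not a real word): TA + VIRAMA + YA, KA + VIRAMA + YA.
sample = TA + YA_PHALAA + KA + YA_PHALAA

if __name__ == "__main__":
    print(count_ya_phalaa(sample))
```

    Because every cluster in -YA uses the same sequence, a single
    search like this covers them all, with no second spelling to check.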

    It is easy to point fingers about a mismatch that someone like me
    makes, but the Unicode encoding model for Indic scripts is very
    robust, and we do our best to apply it correctly.

    Your proposed combining ya-phalaa will do Bengali no service, as it
    will introduce multiple spellings for consonant clusters in -YA. I
    have already stated on this forum:

    "For example, in Sanskrit and Bengali, we have the word pratyeka
    'each, every'. This is derived from the Sanskrit root prati
    (expressing likeness or comparison) plus eka 'one'. In Sanskrit
    orthography i + e becomes ye and is so written. Now in Bengali this
    word also exists and in both languages what is written is PA + VIRAMA
    + RA + TA + VIRAMA + YA + E + KA."

    It would be absurd -- and wrong -- to spell the Sanskrit word one way
    and the Bengali word another, especially as it is the same word.
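
    A short Python sketch of the point, under stated assumptions: the
    helper name spell is hypothetical, and the "E" in the sequence above
    is taken to be the dependent VOWEL SIGN E. The same abstract sequence
    of character names yields the word in either script.

```python
import unicodedata

# The sequence PA + VIRAMA + RA + TA + VIRAMA + YA + E + KA, written as
# script-independent name suffixes. "VOWEL SIGN E" is an assumption about
# which E is meant (the dependent vowel sign).
PRATYEKA = [
    "LETTER PA", "SIGN VIRAMA", "LETTER RA",
    "LETTER TA", "SIGN VIRAMA", "LETTER YA",
    "VOWEL SIGN E", "LETTER KA",
]


def spell(script: str, names=PRATYEKA) -> str:
    """Build a string by looking up '<script> <name>' for each name."""
    return "".join(unicodedata.lookup(f"{script} {name}") for name in names)


bengali = spell("BENGALI")
devanagari = spell("DEVANAGARI")

if __name__ == "__main__":
    for label, word in (("Bengali", bengali), ("Devanagari", devanagari)):
        print(label, " ".join(f"U+{ord(ch):04X}" for ch in word))
```

    The two strings differ only in which script block the characters
    come from; the spelling itself is identical.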

    >IMHO, TUS needs solid rules; Exceptions, hacks, patches, or workarounds
    >should definitely be avoided wherever possible. (If you care to look
    >back in the mailing list archives a few years, you will see that the
    >"a+Virama+Ya+aa" kludge was originally proposed as a workaround due to
    >the lack of a separate encoded letter)

    It isn't a kludge. It is a consistent application of the rules.
    Ya-phalaa is a presentation form of YA in conjunction with a
    preceding consonant or -- a Bengali innovation -- an independent
    vowel.

    In keeping this stance, Andy, I am defending the Unicode Standard's
    encoding principles. The Indic encoding model is constantly under
    attack from people who want explicit rephas, explicit half-forms,
    explicit ya-phalaas, and all sorts of other explicit things, which,
    were we to encode them, would make the standard very much worse than
    it is.

    To reiterate our consistency in using this model, I will give you a
    Malayalam example.

    NA + VIRAMA + MA --> NMA (a single conjunct)
    NA + VIRAMA + ZWNJ + MA --> NMA (with a visible virama breve above and between)
    NA + VIRAMA + ZWJ + MA --> NMA (with the cillaks.aram virama curl)
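
    The three sequences can be written out in a small Python sketch.
    The dictionary labels and the helper strip_joiners are hypothetical
    names for the example. It shows only the code points involved; how
    each sequence is displayed depends on the font and renderer.

```python
import unicodedata

NA = unicodedata.lookup("MALAYALAM LETTER NA")
VIRAMA = unicodedata.lookup("MALAYALAM SIGN VIRAMA")
MA = unicodedata.lookup("MALAYALAM LETTER MA")
ZWNJ = unicodedata.lookup("ZERO WIDTH NON-JOINER")
ZWJ = unicodedata.lookup("ZERO WIDTH JOINER")

# The three encodings listed above; the labels are informal descriptions.
SEQUENCES = {
    "conjunct": NA + VIRAMA + MA,
    "visible virama": NA + VIRAMA + ZWNJ + MA,
    "cillaksaram": NA + VIRAMA + ZWJ + MA,
}


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWJ, leaving only the underlying letters and virama."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


if __name__ == "__main__":
    for label, seq in SEQUENCES.items():
        print(label, " ".join(f"U+{ord(ch):04X}" for ch in seq))
```

    With the joiners removed, all three reduce to the same NA + VIRAMA +
    MA: the letters are encoded once, and only the requested presentation
    differs.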

    We prefer to apply this consistency to Bengali as well. Thank you for
    correcting my error earlier. That kind of feedback is helpful.
    Beating us up because you don't like our encoding model isn't.

    Michael Everson * Everson Typography
