RE: New property for reordrant dependent vowels reordering?

From: Kent Karlsson (
Date: Mon Sep 05 2005 - 03:46:02 CDT


    Richard Wordingham wrote:

    > This is definitely not true for Burmese, where the 'native' spelling
    > does not allow orthographic syllables to straddle phonetic syllable
    > boundaries. CVCDV..., where D is an oral stop consonant, is split into
    > three orthographic syllables, CV, C+visible virama, DV. The split
    > CV, CDV is a mark of foreign (chiefly Pali) origin. As with other
    > Pallava scripts, several Burmese matras appear on the far left, even
    > in CCV syllables. The formulation above is the sort of thing that
    > makes users complain that this use of virama is unnatural. To me, a
    > more natural formulation is:
    > <consonant, {combining marks, at least one of which is a conjoiner}>*
    > <consonant, {maybe combining marks or visible virama, no conjoiner}>

    Whether a virama is visible or not (absorbed into a half form or a conjunct)
    is in general font dependent, so the above is not a good criterion for
    orthographic syllables. You really need a character based criterion, which
    is font independent.
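
    As a minimal sketch of what a character based criterion could look like,
    the Python below segments Devanagari text into orthographic syllables by
    looking only at code points. The rule it uses is an assumption made for
    illustration, not something defined in this thread: a virama followed
    (optionally through ZWJ or ZWNJ) by a consonant continues the syllable,
    whether or not a font would show that virama. The names SYLLABLE and
    orthographic_syllables are hypothetical, and only the
    consonant-virama-consonant ordering is covered.

```python
import re

# Character classes for Devanagari (a simplified inventory, for illustration).
CONS = "[\u0915-\u0939\u0958-\u095F]"   # consonant letters KA..HA, QA..YYA
NUKTA = "\u093C"
VIRAMA = "\u094D"
JOINER = "[\u200C\u200D]"               # ZWNJ or ZWJ
VOWEL_SIGN = "[\u093E-\u094C]"          # dependent vowel signs (matras)
MODIFIER = "[\u0901-\u0903]"            # candrabindu, anusvara, visarga

# Assumed rule: (consonant virama [joiner])* consonant, then either a final
# virama or an optional vowel sign with modifiers. Font behaviour plays no part.
SYLLABLE = re.compile(
    f"(?:{CONS}{NUKTA}?{VIRAMA}{JOINER}?)*"
    f"{CONS}{NUKTA}?"
    f"(?:{VIRAMA}{JOINER}?|{VOWEL_SIGN}?{MODIFIER}*)"
)

def orthographic_syllables(text):
    """Return the consonant-based orthographic syllables found in text.

    Characters outside the pattern (spaces, independent vowels, ...) are
    skipped rather than reported.
    """
    return [m.group() for m in SYLLABLE.finditer(text)]

if __name__ == "__main__":
    # KA I, then TTA VIRAMA ZWNJ TTHA I
    sample = "\u0915\u093F\u091F\u094D\u200C\u0920\u093F"
    for syllable in orthographic_syllables(sample):
        print(" ".join(f"U+{ord(ch):04X}" for ch in syllable))
```

    Under this assumed rule the segmentation is the same with or without the
    ZWNJ, since the joiner only requests a particular appearance.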

    > > You mean that the consonants in the orthographic syllables do not
    > > generally form conjuncts/ligatures...
    > No! I was in general wrong when I said the font determined the
    > conjuncts. For most Indian scripts the font does, but it does not for
    > Brahmi, Burmese or Khmer and I think not for Tibetan and Dai Lanna
    > (at least, when vowels do not interpose). The primitive method of
    > forming conjuncts is just to stack the consonants vertically,

    That's not a conjunct, that's a stack ;-) We are obviously using
    different terminology here. When I wrote "conjunct" read "conjunct
    form" (and look that up in the TUS4 glossary).

    > > That is, as Eric Muller wrote, then two *orthographic* conventions.
    > These are not the two Eric Muller spoke of. We are talking of three
    > conventions where half-forms are not available.

    Again, this is in general font dependent.

    > In Devanagari visual order they are:
    > 1) <i da virama dha>
    > 2) <da virama i dha>
    > 3) <i d.dha>
    > Peter is referring to all three; Eric Muller to forms (2) and (3).

    There is a standard way of distinguishing (1) and (3), by the use of
    ZW(N)J just after the virama character; the default (no ZW(N)J present) is
    font dependent between (1) and (3). There is no standard way of
    getting (2).
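
    To make the encodings concrete, here is a small Python sketch that builds
    the DA + DHA cluster with nothing, ZWNJ, or ZWJ right after the virama and
    lists the resulting code points. In Unicode's Devanagari conventions ZWNJ
    after the virama requests a visible virama and ZWJ requests a half form;
    how each sequence is actually drawn still depends on the renderer and
    font. The helper names encode_cluster and describe are hypothetical.

```python
import unicodedata

VIRAMA = "\u094D"
ZWNJ = "\u200C"   # after virama: request an explicit (visible) virama
ZWJ = "\u200D"    # after virama: request a half form

def encode_cluster(first, second, vowel_sign, joiner=""):
    """Build <first, VIRAMA, [joiner], second, vowel_sign> in logical order."""
    return first + VIRAMA + joiner + second + vowel_sign

def describe(text):
    """List each code point of text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

DA, DHA, SIGN_I = "\u0926", "\u0927", "\u093F"

font_choice = encode_cluster(DA, DHA, SIGN_I)            # no joiner: left to the font
explicit_virama = encode_cluster(DA, DHA, SIGN_I, ZWNJ)  # visible virama requested
half_form = encode_cluster(DA, DHA, SIGN_I, ZWJ)         # half form requested

if __name__ == "__main__":
    for label, text in [("no joiner", font_choice),
                        ("ZWNJ", explicit_virama),
                        ("ZWJ", half_form)]:
        print(label)
        for line in describe(text):
            print("  " + line)
```

    The three strings differ only in the character after the virama, so the
    distinction lives in the text itself and not in the font.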

    > > These must *reliably* be distinguished in the underlying text.
    > > It must NOT be font dependent (for properly constructed fonts).
    > This would be unreasonable if you are referring to (2) v. (3). You
    > would be requiring that for each *language* all Devanagari fonts have
    > the same language-dependent repertoire of conjuncts.

    Eh, no. I don't think I have said anything requiring that. See above.

    > With Uniscribe and Mangal 1.20, that currently yields <i tta virama
    > ttha>. In Windows Vista, this is to be overridable, I presume by
    > feature selection.

    We really need a character based standard way of selecting between
    these. Leaving it entirely implementation and font dependent will
    result in apparent spelling changes between different platforms/fonts.
    As these are, to the eye, spelling changes, there really needs to be a
    character based difference, and a standardised one.

    > > < TTA, VIRAMA, ZWNJ, TTHA, ZWJ, I > -- I before TTHA, with visible
    > > virama
    > I believe the ZWJ should currently be redundant, and for consistency
    > with workable Burmese should remain so. With the Uniscribe I'm using,
    > it actually forces a new cluster and thus generates the dotted circle.
    > > < TTA, VIRAMA, TTHA, ZWJ, I > -- I before TTHA, with visible virama
    > Same Uniscribe problem as above.
    > I'm happier with the current Uniscribe schemes:
    > <TTA, I, VIRAMA, ZWNJ, TTHA> yields vowel on the left - टि्‍ठ.
    > <TTA, VIRAMA, ZWNJ, TTHA, I> yields vowel in the middle - ट्‍ठि.

    I'm not happy to leave this to be entirely platform/font dependent.

    I'm not particular about exactly how these are differentiated,
    but I am keen that a character based (font independent) differentiation
    be standardised.

                    /kent k
