SE Asian Repha & Analogues (was: Colouring combining Marks)

From: Richard Wordingham
Date: Mon Jun 20 2005 - 20:20:24 CDT


    Doug Ewell wrote:

    > I'd prefer to see "visual order" used to mean the direction generally
    > appropriate for the script -- LTR for Latin, RTL for Hebrew -- but
    > without reordering or other details that break the normal
    > directionality. "Logical order" would be similar, but with these
    > details added. I'm sure an expert in Bengali or Tamil or Khmer could
    > come up with suitable examples.

    'Logical order' is pretty much the same as phonetic order in these
    languages. The only exceptions I can think of are Khmer bantoc (U+17CB) and
    Khmer robat (U+17CC). Bantoc shortens the vowel of the *previous*
    orthographic syllable, so /cap/ 'finish' is <ca><ba><bantoc>. Robat
    represents /r/ at the start of a cluster, but in Unicode is written after
    the second consonant of the cluster. Visually it appears above. (Some
    would say that robat is a regrettable deviation from logical order.) It is
    curious that both Bengali and Khmer can write the cluster -ry- with a
    subscript ya/yo, although the normal way of writing <r> at the start of a
    cluster is repha (U+17CC robat in Khmer).
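
As a rough illustration of the stored order described above, the following Python sketch prints the code points of two Khmer spellings. The first is the <ca><ba><bantoc> spelling given in the message. The second, <tho><mo><robat> for 'dharma', is an assumed example that is not taken from the message; it is there only to show robat stored after the consonant it sits above. The helper name `describe` is hypothetical.

```python
import unicodedata

# /cap/ 'finish': <ca><ba><bantoc>, as spelled out in the message.
CAP = "\u1785\u1794\u17CB"

# Assumed illustration (not from the message): <tho><mo><robat>, 'dharma'.
# Robat is stored last, although the /r/ it stands for starts the cluster.
DHARMA = "\u1792\u1798\u17CC"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point, in stored order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, word in (("cap", CAP), ("dharma", DHARMA)):
    print(label)
    for line in describe(word):
        print("  " + line)
```

In both words the sign comes last in the stored text, so any process that works on the code point sequence sees bantoc and robat after the consonant they are attached to.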

    Do such vagaries occur with Burmese nga? In a consonant cluster an initial
    nga is rendered as a superscript mark called kinzi. It also occurs in Old
    Tai Lue / Lanna Tai. However, I have a text book for the Lanna script that
    gives an example of /ngiw/ with tone mark 2 written as what on the Burmese
    model would be encoded something like <nga><virama><wa><dependent i><tone
    mark 2>. Presumably a ZWNJ would be required before the virama to stop
    <nga> being rendered as the kinzi-like form, though I don't believe the
    kinzi form occurs word-initially. (Word initial ngw- does occur in the
    Northern Thai language, but I don't have any spellings for such words.) I
    must say I like the use of the special Khmer character 'coeng' instead of
    a virama to form consonant stacks.
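
The /ngiw/ sequence discussed above can be sketched in Python, with loud caveats. The Lanna script has no code points of its own here, so the sketch borrows Myanmar code points as stand-ins, following the "Burmese model" wording. The Lanna tone mark 2 has no obvious Myanmar counterpart and is left out. The function name `ngiw` and its parameter are hypothetical, and placing the ZWNJ before the virama merely follows the suggestion in the message; it is not a statement of what any rendering engine actually does.

```python
import unicodedata

# Myanmar code points used as stand-ins for the Lanna letters (assumption).
NGA, VIRAMA, WA, SIGN_I = "\u1004", "\u1039", "\u101D", "\u102D"
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def ngiw(block_kinzi):
    """Build <nga>[<ZWNJ>]<virama><wa><dependent i>; tone mark 2 is omitted."""
    parts = [NGA]
    if block_kinzi:
        # Per the message's suggestion, the ZWNJ goes before the virama.
        parts.append(ZWNJ)
    parts += [VIRAMA, WA, SIGN_I]
    return "".join(parts)


for flag in (False, True):
    print("block_kinzi =", flag)
    for ch in ngiw(flag):
        print(f"  U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two strings differ only by the ZWNJ, which is exactly the sort of invisible distinction that a dedicated stacking character such as coeng avoids.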

