Re: ZWJ, ZWNJ, CGJ and combination

From: Peter Kirk
Date: Sun Nov 09 2003 - 16:13:26 EST

    On 09/11/2003 11:11, Mark Davis wrote:

    >Thus a combining character sequence *cannot* contain a ZWJ or any other Cf.
    >... Such a sequence would not correspond to anything used in a natural
    >► शिष्यादिच्छेत्पराजयम् ◄
    But does the Khmer script follow this rule? Please bear in mind that I
    know nothing about this script. But in TUS v4.0 10.4 p.281 I read:

    > Ordering of Syllable Components. The standard order of components in
    > an orthographic syllable as expressed in BNF is
    > B {R | C} {S {R}}* {{Z} V} {O} {S}
    > where
    > B is a base character (consonant character, independent vowel
    > character, and so on)
    > R is a robat
    > C is a consonant shifter
    > S is a subscript consonant or independent vowel sign
    > V is a dependent vowel sign
    > Z is the zero width non-joiner
    > O is any other sign
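
To make the quoted syntax easier to experiment with, here is a rough sketch of it as a Python regular expression. It is not taken from the standard: I am assuming that the braces mean "optional" and that * means repetition, and the code point ranges chosen for each class (and the names SYLLABLE and is_syllable) are my own simplified guesses, in particular treating S as coeng (U+17D2) followed by a consonant or independent vowel.

```python
import re

# Simplified, assumed character classes for the quoted BNF (not normative).
B = "[\u1780-\u17B3]"                    # consonants and independent vowels
R = "\u17CC"                             # robat
C = "[\u17C9\u17CA]"                     # consonant shifters (muusikatoan, triisap)
S = "\u17D2[\u1780-\u17B3]"              # assumed: coeng + consonant or independent vowel
V = "[\u17B6-\u17C5]"                    # dependent vowel signs
Z = "\u200C"                             # zero width non-joiner
O = "[\u17C6-\u17C8\u17CB\u17CD-\u17D1]" # a rough guess at "any other sign"

# B {R | C} {S {R}}* {{Z} V} {O} {S}, reading {...} as optional.
SYLLABLE = re.compile(
    f"{B}(?:{R}|{C})?(?:{S}(?:{R})?)*(?:(?:{Z})?{V})?(?:{O})?(?:{S})?"
)

def is_syllable(text: str) -> bool:
    """Return True if the whole string fits the sketched syllable pattern."""
    return SYLLABLE.fullmatch(text) is not None

if __name__ == "__main__":
    print(is_syllable("\u1780\u17D2\u1780\u17B6"))  # ka, coeng + ka, vowel sign aa
    print(is_syllable("\u1794\u17CC"))              # ba, robat
    print(is_syllable("\u17B6"))                    # vowel sign with no base
```

Anyone who knows the script better should adjust the classes before relying on this for anything.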

    The first example given using ZWNJ, on p.282, starts with ba + ZWNJ +
    triisap + ii, i.e. <1794, ZWNJ, 17CA, 17B8>. 1794 is a base character
    (Lo), but 17CA and 17B8 are class 0 combining characters (Mn). The
    syntax implies that other Mn characters, e.g. robat, 17CC, may occur
    between the base character and the ZWNJ. So here is a case in natural
    language where ZWNJ may be both preceded and followed by combining
    characters, giving a technically defective combining sequence. Or have I
    misunderstood things here?
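
The character properties mentioned above can be checked with Python's standard unicodedata module. The following is only a small sketch: the helper names describe and cf_between_marks are my own, and the second function is an informal test for "a Cf character with a combining mark on each side", not the standard's definition of a defective combining sequence.

```python
import unicodedata

def describe(text):
    """List (code point, general category, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.combining(ch))
        for ch in text
    ]

def cf_between_marks(text):
    """Informal check: is any Cf character directly preceded and followed by a mark?"""
    cats = [unicodedata.category(ch) for ch in text]
    return any(
        cats[i] == "Cf" and cats[i - 1].startswith("M") and cats[i + 1].startswith("M")
        for i in range(1, len(cats) - 1)
    )

if __name__ == "__main__":
    # The sequence cited from p.282: ba, ZWNJ, triisap, ii.
    for row in describe("\u1794\u200C\u17CA\u17B8"):
        print(row)
    # A hypothetical sequence with robat before the ZWNJ, as the syntax seems to allow.
    print(cf_between_marks("\u1794\u17CC\u200C\u17B8"))
```

Run on the cited sequence, describe reports Lo for 1794, Cf for ZWNJ and Mn with combining class 0 for 17CA and 17B8, which matches the properties stated above.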

    Note that I am not proposing a change to Khmer, but just a clarification
    of definitions and the consistency of their application, and a good
    reason why what is allowed in Khmer would not be allowed in Hebrew.

    Peter Kirk
