PR #96: Allowing Joiners in IDs: suggestion on B1 for ZWJ

From: Cibu C J (cibucj@gmail.com)
Date: Fri Jan 05 2007 - 16:43:08 CST


Sorry for cross-posting this to the Indic list as well.

The relevant text is below:
-----------------------------
B. ZWJ in the following contexts: In a conjunct context. That is, a
sequence of the form:

    * A Letter, followed by zero or more combining marks, followed by
      a Virama, followed by a ZWJ, followed by zero or more combining
      marks, followed by a Letter.
    * As a regular expression:

      /$L $M* $V ZWJ $M* $L/
      where:

          o $L = [:General_Category=Letter:]
          o $M = [:General_Category=Mark:]
          o $V = [:Canonical_Combining_Class=Virama:]
--------------------------------

This does not cover the case of a Chillu letter at the end of a word,
where no letter follows the ZWJ. So the B1 regular expression should be
more inclusive and drop the trailing $L:
/$L $M* $V ZWJ $M*/
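
A minimal sketch of the difference between the two expressions, in
Python. Python's re module has no Unicode property classes, so each
character is first mapped to a one-letter class with unicodedata and
the expressions are run over that class string. The names classify,
ORIGINAL_B1, PROPOSED_B1 and has_b1_context are only illustrative, and
the sample strings assume the NA + VIRAMA + ZWJ spelling of the Chillu.

```python
import re
import unicodedata

ZWJ = "\u200d"

def classify(ch):
    """Map a character to a one-letter class used by the patterns below."""
    if ch == ZWJ:
        return "J"
    if unicodedata.combining(ch) == 9:      # Canonical_Combining_Class=Virama
        return "V"
    cat = unicodedata.category(ch)
    if cat.startswith("L"):                 # General_Category=Letter
        return "L"
    if cat.startswith("M"):                 # General_Category=Mark
        return "M"
    return "x"                              # anything else

# A Virama is itself a Mark, so $M corresponds to the class set [MV].
ORIGINAL_B1 = re.compile(r"L[MV]*VJ[MV]*L")   # /$L $M* $V ZWJ $M* $L/
PROPOSED_B1 = re.compile(r"L[MV]*VJ[MV]*")    # /$L $M* $V ZWJ $M*/

def has_b1_context(text, pattern):
    """Return True if the text contains a ZWJ in the given conjunct context."""
    classes = "".join(classify(ch) for ch in text)
    return pattern.search(classes) is not None

# A, VA, then NA + VIRAMA + ZWJ (Chillu) at the end of the word.
word_final = "\u0d05\u0d35\u0d28\u0d4d\u200d"
# NA + VIRAMA + ZWJ (Chillu) followed by the letter MA.
word_medial = "\u0d28\u0d4d\u200d\u0d2e"

print(has_b1_context(word_final, ORIGINAL_B1))   # False
print(has_b1_context(word_final, PROPOSED_B1))   # True
print(has_b1_context(word_medial, ORIGINAL_B1))  # True
print(has_b1_context(word_medial, PROPOSED_B1))  # True
```

The word-final Chillu is the only sample the original expression
rejects, because nothing follows the ZWJ there.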

BTW, I don't know of any combining marks in Malayalam that would occur
here. Does more than zero $M make sense in the case of Malayalam? I
agree this is a general regular expression and may be applicable to
other scripts. I was just wondering which ones they are.

Thanks
Cibu


