PRI #82

From: Peter Constable (petercon@microsoft.com)
Date: Tue Dec 13 2005 - 14:22:24 CST

    Re ordering of multiple vowel marks in Indic scripts: Uniscribe will allow multiple vowel marks to appear on a single consonant -- at most one each in the left, above, below and right positions -- provided they are entered in *that* order.

    For common cases of multiple marks, there is also a pre-composed composite-matra character encoded, so for most data this issue won't arise. But we will handle data in which these are decomposed, and we will also handle data that includes other combinations of vowel marks, with the constraint mentioned above. If marks are encountered in a different order, then we will break the cluster at the point at which that order breaks down. E.g. (cluster break indicated by |):

    C Vmk-above | Vmk-left

    The dangling vowel marks after the cluster break will display on a dotted circle.
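    As a rough illustration only, the cluster-breaking rule described above can be sketched in Python. This is not Uniscribe code: the token names ("C", "left", "above", "below", "right"), the function split_clusters and the "dotted-circle" label are all hypothetical, and the sketch assumes that marks following a dangling mark are checked against the same left/above/below/right order.

```python
# Hypothetical sketch of the ordering rule; not Uniscribe's implementation.
# Required entry order of vowel-mark positions on one base.
ORDER = {"left": 0, "above": 1, "below": 2, "right": 3}

def split_clusters(tokens):
    """Split a token list into clusters.

    tokens: "C" for a consonant, or a position name from ORDER for a vowel mark.
    A mark that does not come strictly later in ORDER than the previous mark
    in the cluster starts a new cluster on a dotted circle.
    """
    clusters = []
    last_rank = None
    for tok in tokens:
        if tok == "C":
            clusters.append({"base": "C", "marks": []})
            last_rank = -1  # any mark may follow a fresh consonant
        else:
            rank = ORDER[tok]
            # Break if there is no base yet, or the order is violated
            # (this also rejects two marks in the same position).
            if not clusters or rank <= last_rank:
                clusters.append({"base": "dotted-circle", "marks": []})
            clusters[-1]["marks"].append(tok)
            last_rank = rank
    return clusters

if __name__ == "__main__":
    # The example from the message: C Vmk-above | Vmk-left
    for cluster in split_clusters(["C", "above", "left"]):
        print(cluster)
```

    With the input from the example, the sketch yields one cluster holding the consonant and the above mark, followed by a second cluster in which the left mark sits on a dotted circle.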

    According to the definition of "combining mark sequence", we are failing to treat some sequences as single combining mark sequences that are such by definition. On the other hand, since the Indic vowel marks all have canonical combining classes of 0, the different orders of marks are significant with respect to normalization, and so we think it best to consider the difference semantically significant and therefore to display them differently. But the only possible way to differentiate in terms of display different orders of marks that don't interact typographically is to treat one as a valid combination -- i.e. valid as a single cluster -- and display it in the appropriate manner but to consider the other as not valid as a single cluster and break it into multiple clusters.
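    The normalization point can be checked with Python's standard unicodedata module. The sketch below is illustrative only; the specific code points (Devanagari KA with vowel signs I and U, and Latin "a" with acute and dot below) are examples chosen here, not ones named in this message.

```python
import unicodedata

# Example code points (chosen for illustration, not taken from the message).
KA = "\u0915"             # DEVANAGARI LETTER KA
SIGN_I = "\u093F"         # DEVANAGARI VOWEL SIGN I (left-positioned)
SIGN_U = "\u0941"         # DEVANAGARI VOWEL SIGN U (below)
ACUTE = "\u0301"          # COMBINING ACUTE ACCENT (ccc 230)
DOT_BELOW = "\u0323"      # COMBINING DOT BELOW (ccc 220)

def nfd(s):
    return unicodedata.normalize("NFD", s)

# Both vowel signs have canonical combining class 0, so normalization
# never reorders them: the two orders stay distinct strings.
indic_same = nfd(KA + SIGN_I + SIGN_U) == nfd(KA + SIGN_U + SIGN_I)

# Marks with different non-zero classes are reordered canonically,
# so the two orders normalize to the same string.
latin_same = nfd("a" + ACUTE + DOT_BELOW) == nfd("a" + DOT_BELOW + ACUTE)

if __name__ == "__main__":
    print(unicodedata.combining(SIGN_I), unicodedata.combining(SIGN_U))
    print("Indic orders equivalent:", indic_same)
    print("Latin orders equivalent:", latin_same)
```

    The contrast with the Latin marks is what makes the order of Indic vowel marks a distinction that survives normalization.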

    Peter Constable


