Re: Myanmar Ordering of Syllable Components vs. canonical order

From: Philippe Verdy <>
Date: Fri, 6 Sep 2013 20:17:41 +0200

The canonical order carries absolutely no semantic meaning except to
separate classes of combining characters within which the order is
significant. When canonical combining classes are different, you cannot
infer any logical order between these diacritics, and you need a collation
tailoring to determine the logical order, or need to insert additional
controls between them to encode their order when both orders are possible
and semantically different.

This is not exceptional, and the Hebrew script for example has such complex
cases for some diacritics that were historically given a non-zero combining
class, correctly distinct from other non-zero combining classes (but these
diacritics should probably have used a zero combining class). You cannot
solve it using only canonical order, whose only intent is to convey the
canonical equivalences.
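
To make the Hebrew case concrete, here is a minimal Python sketch. It rests on
assumptions that are mine and not stated above: that PATAH (U+05B7) followed by
HIRIQ (U+05B4) on a LAMED is a fair example of the problem, and that COMBINING
GRAPHEME JOINER (U+034F, combining class 0) is the kind of "additional control"
meant. It uses only the standard unicodedata module.

```python
import unicodedata

# Assumed example characters (my choice, not taken from the message above).
LAMED = "\u05DC"   # HEBREW LETTER LAMED, ccc=0
PATAH = "\u05B7"   # HEBREW POINT PATAH, non-zero ccc
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, a lower non-zero ccc than PATAH
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER, ccc=0


def code_points(text):
    """Return the text as space-separated U+XXXX labels."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The order the writer intended: patah first, then hiriq.
written = LAMED + PATAH + HIRIQ
normalized = unicodedata.normalize("NFC", written)

# The same sequence with a ccc=0 character between the two points.
protected = LAMED + PATAH + CGJ + HIRIQ
protected_normalized = unicodedata.normalize("NFC", protected)

for label, text in [
    ("written", written),
    ("normalized", normalized),
    ("protected", protected),
    ("protected, normalized", protected_normalized),
]:
    print(f"{label:22} {code_points(text)}")
```

Without the joiner, normalization swaps the two points, because their combining
classes differ and the written order is not the ascending one. With the joiner,
neither point can be reordered across the class-0 character, so the written
order survives. Whether the extra character is acceptable to fonts and
collation is a separate question.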

The relative numeric value of distinct non-zero combining classes means
nothing linguistically. All that matters is whether they are non-zero and
whether they are distinct. In other words, the combining classes have NO
order, except for normalization.
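
For the Myanmar case in the quoted message below, the reordering can be checked
directly. This is a small Python sketch using the standard unicodedata module;
the base letter KA (U+1000) is my own choice for illustration, and the class
values printed are whatever the Unicode data bundled with your Python version
reports.

```python
import unicodedata

KA = "\u1000"         # MYANMAR LETTER KA, an assumed base letter for the demo
ASAT = "\u103A"       # MYANMAR SIGN ASAT
DOT_BELOW = "\u1037"  # MYANMAR SIGN DOT BELOW


def code_points(text):
    """Return the text as space-separated four-digit hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# Canonical combining classes as reported by the local Unicode database.
for ch in (ASAT, DOT_BELOW):
    print(f"{ord(ch):04X} ccc={unicodedata.combining(ch)}")

# Order given by the syllable structure table: asat before dot below.
table_order = KA + ASAT + DOT_BELOW

# Both normalization forms apply the same canonical reordering step.
nfd = unicodedata.normalize("NFD", table_order)
nfc = unicodedata.normalize("NFC", table_order)

print("table order:", code_points(table_order))
print("NFD:        ", code_points(nfd))
print("NFC:        ", code_points(nfc))
print("changed by normalization:", nfc != table_order)
```

The two orders are canonically equivalent, so a process that compares
normalized text will treat them as the same string; only the stored order
differs.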

2013/9/6 Markus Scherer <>

> Unicode 6.2 chapter 11.3
> Myanmar, Table 11-3. Myanmar Syllabic Structure, shows that 103A asat sign
> comes before 1037 dot below. However, 1037 has ccc=7 which comes before (in
> canonical order) 103A which has ccc=9.
> Is it correct that Unicode normalization of Myanmar text moves characters
> out of the order in table 11-3?
> If so, should there be a note about this in the text? (Sorry if I just
> missed it.)
> markus
Received on Fri Sep 06 2013 - 13:21:27 CDT
