Three character canonical decompositions in version 2 releases

From: Karl Williamson <>
Date: Tue, 03 Apr 2012 19:57:44 -0600

says that effective
starting in Version 2.0, "Canonical mappings (Decomposition_Mapping
property values) are always limited either to a single value or to a
pair. The second character in the pair cannot itself have a canonical

I noticed that the UnicodeData.txt files shipped with all the Version 2
releases of Unicode contain three-character canonical decompositions.
For example, in 2.1.9, there are these:
  01E0;0041 0307 0304
  01E1;0061 0307 0304
  1E1C;0045 0327 0306
  1E1D;0065 0327 0306

There are many more in 2.0.
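
A minimal sketch of how such entries can be listed, assuming the usual
semicolon-separated UnicodeData.txt layout with the code point in field 0
and the decomposition mapping in field 5, and assuming that compatibility
mappings are marked with a leading <tag>. The function name and the
sample records are hypothetical; the names in the records are
placeholders, not text copied from the data file.

```python
def long_canonical_decompositions(lines, max_len=2):
    """Yield (code_point, parts) for canonical mappings longer than max_len."""
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 6:
            continue  # blank or malformed line
        mapping = fields[5]
        # Skip empty mappings and compatibility mappings such as "<noBreak> 0020".
        if not mapping or mapping.startswith("<"):
            continue
        parts = mapping.split()
        if len(parts) > max_len:
            yield fields[0], parts


# Hand-built illustrative records (assumed layout, placeholder names).
sample = [
    "00A0;NAME;Zs;0;CS;<noBreak> 0020;;;;N;;;;;",
    "00C0;NAME;Lu;0;L;0041 0300;;;;N;;;;;",
    "01E0;NAME;Lu;0;L;0041 0307 0304;;;;N;;;;;",
]

for code_point, parts in long_canonical_decompositions(sample):
    print(code_point, " ".join(parts))
```

To check a real release, the same function could be given an open file
object for that version's UnicodeData.txt in place of the sample list.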

Is it an error on the web site that this policy was in effect in 2.0,
and should it really say 3.0 (as there are no such decompositions in the
data files starting in 3.0)?

Or were these data files defective?
Received on Tue Apr 03 2012 - 20:59:18 CDT
