Kenneth Whistler wrote:
>>> Decompositions are done for the *characters*, not for
>>> the *glyphs*. And for Latin uppercase/lowercase pairs, it is an
>>> ironclad rule that canonical decompositions must involve the same sequence
>>> of combining marks.
>> This is doubtless a fine rule, but (like all fine rules) ought to be
>> documented somewhere. I have never heard it before.
> Well, we considered this self-evident, but apparently it is not.
The trouble is, I think, that combining marks are rather glyphic
in nature, with distinct codepoints that do not necessarily represent
distinct characters. Until I heard about the SMALL G WITH CEDILLA
case, it would never have occurred to me that a CEDILLA
could sometimes be represented by a glyph *above* the base character.
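
For anyone who wants to check the G WITH CEDILLA pair for themselves, here is a minimal sketch using Python's standard unicodedata module. The helper name combining_tail is my own invention for illustration, and the results depend on whichever version of the Unicode Character Database your Python build ships with.

```python
import unicodedata

def combining_tail(ch):
    """Return the code points that follow the base letter in the
    canonical decomposition mapping of ch (hypothetical helper)."""
    parts = unicodedata.decomposition(ch).split()
    return parts[1:]

upper = "\u0122"  # LATIN CAPITAL LETTER G WITH CEDILLA
lower = "\u0123"  # LATIN SMALL LETTER G WITH CEDILLA

# Print each character's name and its decomposition mapping.
for ch in (upper, lower):
    print(unicodedata.name(ch), "->", unicodedata.decomposition(ch))

# Both cases should carry the same sequence of combining marks,
# even though the lowercase glyph usually draws the mark above the g.
print(combining_tail(upper) == combining_tail(lower))
```

If the rule quoted above holds, both letters decompose to their base letter followed by U+0327 COMBINING CEDILLA, whatever the glyphs look like.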
BTW, this possibility seems to play hob with the Canonical Ordering
Behavior rules, since LATIN SMALL LETTER G + COMBINING CEDILLA +
COMBINING MACRON (or any other superior accent) would appear to
allow rendering with the squiggle either above the bar or below it.
The difference cannot be expressed by whether CEDILLA comes before
or after MACRON, since they are class 202 and class 230 respectively.
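
To make the ordering point concrete, here is a small sketch, again assuming Python's unicodedata module; the helper nfd_codepoints and the variable names are mine, not anything from the standard. It looks up the combining classes of the two marks and shows that canonical reordering folds both input orders into one sequence, so the order carries no information.

```python
import unicodedata

CEDILLA = "\u0327"  # COMBINING CEDILLA
MACRON = "\u0304"   # COMBINING MACRON

def nfd_codepoints(s):
    """Return the NFD form of s as a list of hex code points
    (hypothetical helper for display)."""
    return [f"{ord(c):04X}" for c in unicodedata.normalize("NFD", s)]

cedilla_first = "g" + CEDILLA + MACRON
macron_first = "g" + MACRON + CEDILLA

# Canonical combining classes of the two marks.
print(unicodedata.combining(CEDILLA), unicodedata.combining(MACRON))

# Marks with different nonzero classes are sorted by class during
# normalization, so both spellings end up identical.
print(nfd_codepoints(cedilla_first))
print(nfd_codepoints(macron_first))
print(nfd_codepoints(cedilla_first) == nfd_codepoints(macron_first))
```

Since the two spellings are canonically equivalent, a renderer cannot use the input order to decide whether the squiggle goes above or below the bar.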
-- 
John Cowan    http://www.ccil.org/~cowan    email@example.com
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)