    On 2008.10.27, 06:10, Karl Pentzlin <> wrote:

    > also for letters with any kind of "fixed" appendages which are not
    > attached simply at the bottom of a letter (like ogonek or cedilla).

    Even in cases of oddly positioned "appendages", like U+0104 LATIN CAPITAL
    LETTER A WITH OGONEK, there is a canonical equivalence (U+0041 U+0328);
    it was only recently that connected and overstruck diacriticals were
    encoded as separate characters without canonical decompositions.

    This is clear in the more recently added Cyrillic characters, to
    the point that U+00E7 LATIN SMALL LETTER C WITH CEDILLA is canonically
    equivalent to U+0063 U+0327 and yet U+04AB CYRILLIC SMALL LETTER ES WITH
    DESCENDER has no decomposition at all, although in the real world these
    two were the very same lead type, taken from French sorts (later
    Linotype films).
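
    The asymmetry can be checked against the Unicode Character Database. Here
    is a minimal sketch, assuming Python's standard unicodedata module; the
    helper name canonical_decomposition is mine, purely for illustration, and
    the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def canonical_decomposition(ch):
    """Return the canonical decomposition of ch as a list of code points.

    Returns an empty list when the character has no canonical decomposition.
    (Hypothetical helper, not part of any library.)
    """
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; they are not
    # canonical, so treat them the same as "no decomposition".
    if not mapping or mapping.startswith("<"):
        return []
    return [int(cp, 16) for cp in mapping.split()]


# The three characters mentioned in this thread.
for ch in ("\u0104", "\u00e7", "\u04ab"):
    parts = canonical_decomposition(ch)
    shown = " ".join(f"U+{cp:04X}" for cp in parts) or "(none)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {shown}")
```

    The two Latin letters should report a base letter plus a combining mark,
    while the Cyrillic one should report no decomposition.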

    > This is something like the Arabic encoding model, where a model based
    > on ghost characters + combining marks could have been selected but in
    > fact was not.

    This is absolutely not the case. Nobody is saying that "Q" should be made
    canonically equivalent to U+004F U+0330 or some such.

    What is being argued is that, for some reason, new Latin letter +
    diacritical pairs were not accepted as new characters for a long time
    (and rightly so), but recently an exception was made for connecting and
    overstruck diacriticals.

    This seems to be a concession to some technological problem, not (as it
    should be) an actual improvement in encoding philosophy.

    In view of this, Karl's proposal should be accepted, of course, but the
    lack of canonical equivalence for all these characters, and its asymmetry
    with older cases (like the mentioned U+0104), still bugs me.

