Re: Umlaut and Tréma, was: Variation selectors and vowel marks

From: Doug Ewell (
Date: Tue Jul 13 2004 - 13:35:37 CDT

    Peter Kirk <peterkirk at qaya dot org> wrote:

    > But now it seems that WG2, and apparently also the UTC, has decided to
    > accept an encoding using CGJ as a pseudo-variation selector applied to
    > a combining mark (although positioned before it instead of after it),
    > despite it having all of the effects of confusing normalisation which
    > Asmus describes so clearly above - which are even worse in this case
    > because of canonical equivalences. (In practice the new combination
    > for tréma may be used very rarely in combination with other combining
    > marks, but that argument didn't wash before.) The encoding using CGJ
    > also seems to be overloading this character which is intended for
    > something quite different.

    Read N2819 again:

    "The sequences <a, ¨> and <a, CGJ, ¨> are not canonically equivalent.
    [T]his means that the distinction will not be normalized away on
    conversion in and out of bibliographic systems."
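
    The non-equivalence is easy to check. The following is only a minimal
    sketch using Python's standard unicodedata module; the variable names
    and the helper stays_distinct() are made up for illustration and do not
    come from N2819. It assumes, following the quoted text above, that the
    CGJ sequence is the one meant for tréma.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

# Illustrative names only: the plain sequence <a, diaeresis> and the
# CGJ-marked sequence <a, CGJ, diaeresis> discussed in the quote.
plain = "a" + DIAERESIS
marked = "a" + CGJ + DIAERESIS


def stays_distinct(s1: str, s2: str) -> bool:
    """Return True if s1 and s2 remain different under NFC, NFD, NFKC and NFKD."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(
        unicodedata.normalize(form, s1) != unicodedata.normalize(form, s2)
        for form in forms
    )


# The plain sequence composes to precomposed U+00E4 under NFC.
plain_nfc = unicodedata.normalize("NFC", plain)

# CGJ has combining class 0 and no decomposition, so it blocks composition
# and the marked sequence is left unchanged by NFC.
marked_nfc = unicodedata.normalize("NFC", marked)

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in plain_nfc])
    print([f"U+{ord(c):04X}" for c in marked_nfc])
    print(stays_distinct(plain, marked))
```

    Because CGJ is a starter with no decomposition mapping, no normalization
    form folds the two sequences together, which is the property the quote
    relies on.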

    CGJ + COMBINING DIAERESIS is a hack, but then again the need to draw a
    distinction between the exact same combining mark used for two different
    phonetic purposes is a bit of a hack too.

    The alternative proposed by DIN, creating a new COMBINING UMLAUT
    character, would have caused *unprecedented and catastrophic*
    equivalence and normalization problems.

    > It seems to me that the UTC should bite the bullet and accept that
    > there is a need for variation sequences for combining marks, and
    > either adjust the definitions of existing variation selectors or
    > encode new specialised variation selectors for them. The adjusted or
    > new variation selectors can then be used for Hebrew as well as for
    > German - see my posting on this subject to the Hebrew list.

    "When 256 variation selectors just won't do, invent another."
    (with apologies to Ken Whistler)

    -Doug Ewell
     Fullerton, California

