From: Doug Ewell (dewell@adelphia.net)
Date: Tue Jul 13 2004 - 13:35:37 CDT
Peter Kirk <peterkirk at qaya dot org> wrote:
> But now it seems that WG2, and apparently also the UTC, has decided to
> accept an encoding using CGJ as a pseudo-variation selector applied to
> a combining mark (although positioned before it instead of after it),
> despite it having all of the effects of confusing normalisation which
> Asmus describes so clearly above - which are even worse in this case
> because of canonical equivalences. (In practice the new combination
> for tréma may be used very rarely in combination with other combining
> marks, but that argument didn't wash before.) The encoding using CGJ
> also seems to be overloading this character which is intended for
> something quite different.
Read N2819 again:
"The sequences <a, ¨> and <a, CGJ, ¨> are not canonically equivalent.
[T]his means that the distinction will not be normalized away on
conversion in and out of bibliographic systems."
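The non-equivalence that N2819 describes can be checked directly. What follows is only a minimal sketch using Python's standard `unicodedata` module; the variable names and the `canonically_equivalent` helper are illustrative and do not come from N2819.

```python
import unicodedata

A = "a"
DIAERESIS = "\u0308"  # COMBINING DIAERESIS
CGJ = "\u034F"        # COMBINING GRAPHEME JOINER

plain = A + DIAERESIS         # <a, diaeresis>
marked = A + CGJ + DIAERESIS  # <a, CGJ, diaeresis>

def canonically_equivalent(s, t):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)

def code_points(s):
    """Render a string as a list of U+XXXX labels for inspection."""
    return [f"U+{ord(c):04X}" for c in s]

nfc_plain = unicodedata.normalize("NFC", plain)
nfc_marked = unicodedata.normalize("NFC", marked)

print(code_points(nfc_plain))   # composes to the precomposed letter
print(code_points(nfc_marked))  # CGJ stays in place and blocks composition
print(canonically_equivalent(plain, marked))
```

Because CGJ has canonical combining class 0, it sits between the base letter and the diaeresis and prevents the two from composing, so the marked sequence survives NFC and NFD unchanged.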
CGJ + COMBINING DIAERESIS is a hack, but then again the need to draw a
distinction between the exact same combining mark used for two different
phonetic purposes is a bit of a hack too.
The alternative proposed by DIN, creating a new COMBINING UMLAUT
character, would have caused *unprecedented and catastrophic*
equivalence and normalization problems.
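One plausible reading of that concern, offered as an assumption since the post does not spell it out: the existing precomposed umlaut letters already decompose canonically to a base letter plus U+0308 COMBINING DIAERESIS, so a new combining character could not be equivalent to any of the data already encoded that way. A minimal sketch that inspects those decompositions (the `PRECOMPOSED` list and helper name are illustrative):

```python
import unicodedata

# Precomposed a/o/u with diaeresis, lower and upper case.
PRECOMPOSED = "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc"

def uses_combining_diaeresis(ch):
    """True if the canonical decomposition of ch contains U+0308."""
    return "\u0308" in unicodedata.normalize("NFD", ch)

for ch in PRECOMPOSED:
    # decomposition() returns the mapping as space-separated hex code points.
    print(f"U+{ord(ch):04X}", "->", unicodedata.decomposition(ch))
```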
> It seems to me that the UTC should bite the bullet and accept that
> there is a need for variation sequences for combining marks, and
> either adjust the definitions of existing variation selectors or
> encode new specialised variation selectors for them. The adjusted or
> new variation selectors can then be used for Hebrew as well as for
> German - see my posting on this subject to the Hebrew list.
"When 256 variation selectors just won't do, invent another."
(with apologies to Ken Whistler)
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/