From: Doug Ewell (email@example.com)
Date: Tue Feb 11 2003 - 00:27:26 EST
Kenneth Whistler <kenw at sybase dot com> wrote:
> Long ago
> it was decided that it would not be a good idea to extend
> formal character decomposition to such base letterform shape
> changes or bars across letters. (Note that Latin characters
> with bars: barred-b, barred-d, barred-i, barred-u, barred-l,
> and the like are also not decomposed formally. Similarly for
> Latin letters with hooks, and so on.)
> So formal canonical decompositions are almost entirely
> confined to separable, accent-like diacritics (acute,
> grave, diaeresis, and so on). The only significant exceptions are
> the cedilla and ogonek, which attach smoothly to letter
> bottoms without otherwise distorting them, and which
> often have graphic alternates that are, indeed, separated
> diacritics (comma-like and reverse-comma-like forms).
I always wondered why the with-acute and with-circumflex letters were
decomposable but something like U+0141 LATIN CAPITAL LETTER L WITH
STROKE was not. After all, Unicode has combining "overstruck
diacritics" like U+0337 COMBINING SHORT SOLIDUS OVERLAY; isn't that what
one would use to compose an L-stroke? Same for the Maltese and Sami
letters that use a horizontal stroke instead of a diagonal. It always
seemed kind of random to me.
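The split Ken describes can be checked directly against the character database. Here is a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `canonical_decomposition` and the sample list are made up for illustration and are not part of any Unicode API.

```python
import unicodedata

# Precomposed letters to inspect: accent-like marks (acute, circumflex),
# cedilla and ogonek, and three stroke letters (Polish, Maltese, Sami).
SAMPLES = ["\u00C9", "\u00C2", "\u00C7", "\u0104", "\u0141", "\u0126", "\u0166"]


def canonical_decomposition(ch):
    """Return the canonical decomposition of ch as a list of code points.

    Returns an empty list when the character has no canonical decomposition.
    (Hypothetical helper, named here only for this example.)
    """
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; they are not
    # canonical, so treat them the same as "no decomposition".
    if not mapping or mapping.startswith("<"):
        return []
    return [int(cp, 16) for cp in mapping.split()]


for ch in SAMPLES:
    parts = canonical_decomposition(ch)
    shown = " + ".join(f"U+{cp:04X}" for cp in parts) or "(none)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {shown}")
```

The acute, circumflex, cedilla, and ogonek letters each report a base letter plus a combining mark, while the three stroke letters report no decomposition at all.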
Ken's reply explains why Cyrillic descenders and the like, which distort
or deform the base character in some way, are not decomposable, and I
can buy that, but I still don't see why stroke overlays are lumped in
with that group. They don't distort the base form any more than
cedillas and ogoneks do -- and isn't this a glyph issue anyway?
Of course, the important thing is that they are NOT decomposable, for
whatever historical reason, and won't be in the future.
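The practical consequence is that an L followed by a combining overlay is a different string from U+0141 under normalization. A rough sketch of that check, again assuming Python's standard `unicodedata` module (the function name `canonically_equivalent` is only illustrative):

```python
import unicodedata

L_STROKE = "\u0141"          # LATIN CAPITAL LETTER L WITH STROKE
L_PLUS_OVERLAY = "L\u0337"   # L + COMBINING SHORT SOLIDUS OVERLAY
E_ACUTE = "\u00C9"           # LATIN CAPITAL LETTER E WITH ACUTE
E_PLUS_ACUTE = "E\u0301"     # E + COMBINING ACUTE ACCENT


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The accented letter and its base + mark spelling normalize to the same thing.
print(canonically_equivalent(E_ACUTE, E_PLUS_ACUTE))

# The stroke letter and base + overlay do not.
print(canonically_equivalent(L_STROKE, L_PLUS_OVERLAY))

# NFC does not compose the overlay sequence, and NFD leaves U+0141 alone.
print(unicodedata.normalize("NFC", L_PLUS_OVERLAY) == L_PLUS_OVERLAY)
print(unicodedata.normalize("NFD", L_STROKE) == L_STROKE)
```

So any process that compares or searches text for L-stroke has to handle the two spellings itself; normalization will not unify them.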