From: Adam Twardoch (email@example.com)
Date: Tue Mar 02 2004 - 02:27:57 EST
> But can someone explain to me why a ligature such as ct which CANNOT be
> accurately decomposed into individual characters (at least, it can't if
> it's designed PROPERLY) shouldn't be encoded in its own right?
I don't think the design of the glyph has anything to do with the issue of
encoding. The "ct" ligature in German was never considered a separate
letter; it was always clearly identified as a special graphic representation
of a digraph. Both "st" and "ct" were always decomposed in semantic terms.
> How about the German double s/eszett (U+00DF), a ligature of long s and s
> which cannot be accurately built up from its components. There was
> probably never any doubt that the eszett would be encoded since it
> appeared in codepages that predated Unicode but is the encoding of the
> eszett merely thought of as an "uncomfortable compromise"?
This is completely different. "ß" is a letter in modern use today, and for
quite some time it has not been considered a ligature. Note that one of its
components, the long s, was dropped from modern use many decades ago. As a
result, "ß" has broken the link to its original decomposition: it no longer
decomposes into long s and s, just as "W" no longer decomposes into a
double V.
In short: st, ct and ß all used to be ligatures, something between one letter
and two letters. Over the years, st and ct have lost their cohesion as units,
while ß has gained semantic value and turned into a letter. "ß" is not a
ligature anymore and therefore must be encoded.
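
The same distinction can be seen in the character data itself. The following
is only a rough sketch, assuming Python's standard unicodedata module as a
view onto the Unicode Character Database; the helper name
compat_decomposition and the choice of samples are mine, for illustration.
U+FB06 is the "st" ligature that exists as a compatibility character, and it
is compared here with "ß" (U+00DF) and "W".

```python
import unicodedata


def compat_decomposition(ch: str) -> str:
    """Return the NFKD (compatibility-decomposed) form of one character."""
    return unicodedata.normalize("NFKD", ch)


# Illustrative samples: a letter that began as a ligature, a ligature
# encoded only for compatibility, and a letter that began as a doubled V.
SAMPLES = {
    "sharp s": "\u00df",
    "st ligature": "\ufb06",
    "W": "W",
}

for name, ch in SAMPLES.items():
    decomposed = compat_decomposition(ch)
    if decomposed != ch:
        status = "decomposes to " + repr(decomposed)
    else:
        status = "has no decomposition"
    print(f"U+{ord(ch):04X} ({name}) {status}")
```

Under those assumptions, the ligature falls apart into its two letters, while
"ß" and "W" stay whole.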
> There must be countless historical facsimile editions printed every year
> which use the st and ct ligature extensively. The production of these
> items would hugely benefit from having a fixed codepoint for "ct" instead
> of it wandering all over the PUA depending on what font you're using.
Oh, there are Latin facsimile prints that extensively use p's, r's or m's
with various tails, waves and bars sticking out all over the letters,
serving as Latin abbreviations. You wouldn't suggest encoding them