From: Adam Twardoch
Date: Tue Mar 02 2004 - 02:27:57 EST

    > But can someone explain to me why a ligature such as ct which CANNOT be
    > accurately decomposed into individual characters (at least, it can't if
    > it's designed PROPERLY) shouldn't be encoded in its own right?

    I don't think the design of the glyph has anything to do with the issue of
    encoding. The "ct" ligature in German never was considered a separate
    letter, it always was clearly identified as a special graphic representation
    of a digraph. Both "st" and "ct" always were decomposed in semantic terms.

    > How about the German double s/eszett (U+00DF), a ligature of long s and s
    > which cannot be accurately built up from its components. There was
    > probably never any doubt that the eszett would be encoded since it
    > appeared in codepages that predated Unicode but is the encoding of the
    > eszett merely thought of as an "uncomfortable compromise"?

    This is completely different. "ß" is a letter in modern use today,
    and for quite some time it has not been considered a ligature. Note that
    one of its components, the long s, was dropped from modern use many
    decades ago. In that sense, "ß" has broken the link to its original
    decomposition. "ß" no longer decomposes into long s and s, just as "W"
    no longer decomposes into a double V.

    In short: st, ct and ß all used to be ligatures, something between a single
    letter and two letters. Over the years, st and ct have lost their integrity
    as units, while ß has gained semantic value and turned into a
    letter. "ß" is not a ligature anymore and therefore must be encoded.
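
    The distinction can be checked against the character data that ships with
    Python's standard unicodedata module. This is only a sketch: the helper name
    describe() is made up for illustration, and the output reflects whatever
    Unicode version the local Python build bundles. It shows that ß carries no
    decomposition mapping, while the already-encoded st ligatures (kept for
    compatibility) fold back to plain letters under NFKD.

```python
import unicodedata


def describe(ch):
    """Return (name, decomposition mapping, NFKD form) for one character.

    describe() is a hypothetical helper for this illustration only.
    An empty decomposition string means the character has no mapping.
    """
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKD", ch),
    )


SHARP_S = "\u00df"            # LATIN SMALL LETTER SHARP S
LONG_S = "\u017f"             # LATIN SMALL LETTER LONG S
ST_LIGATURE = "\ufb06"        # LATIN SMALL LIGATURE ST
LONG_S_T_LIGATURE = "\ufb05"  # LATIN SMALL LIGATURE LONG S T

for ch in (SHARP_S, "W", LONG_S, ST_LIGATURE, LONG_S_T_LIGATURE):
    name, decomp, nfkd = describe(ch)
    print(f"U+{ord(ch):04X} {name}: mapping={decomp or 'none'} NFKD={nfkd!r}")
```

    Neither ß nor W reports a mapping, which matches the point that both have
    cut the link to their historical components.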

    > There must be countless historical facsimile editions printed every year
    > which use the st and ct ligature extensively. The production of these
    > items would hugely benefit from having a fixed codepoint for "ct" instead
    > of it wandering all over the PUA depending on what font you're using.

    Oh, there are Latin facsimile prints that make extensive use of p's, r's or m's
    with various tails, waves and bars sticking out all over the
    letters, serving as Latin abbreviations. You wouldn't suggest encoding them

