Re: triple diacritic (sch with ligature tie in a German dialect writing document)

From: John Hudson
Date: Wed Jun 14 2006 - 23:18:26 CDT


    Michael Everson wrote:

    >> My approach has been to handle such things at the font level using
    >> either ligatures, if dealing with a known and discrete set of
    >> sequences for a particular language (e.g. ch with underscore for
    >> Ethiopic transcription), or using contextually substituted beginning,
    >> middle and end glyphs for arbitrary sequences (e.g. overscores for
    >> Greek nomina sacra). In these cases, I think it is certainly desirable
    >> to handle the underties or overties in character encoding.

    > So you would handle sch with triple undertie how?

    Before I suggest some answers to that, let me clarify that I was drawing attention, contra
    Ken's post, to the fact that at least some instances of tied sequences might be
    properly addressed in text encoding, rather than as Ken suggests by markup. In the case of
    Amharic transcription, for example, you have separate C and H characters with combining
    macrons below (U+0331) and then you have CH with an underscore, and these are clearly
    related orthographic conventions, the latter being encodable as a sequence of C with
    macron below and H with macron below and displayable as a ligature. So when something like
    tied underscores have a logical place within an orthography, I think it makes sense to
    deal with them in terms of text encoding.
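
    As a minimal sketch of the Amharic transcription case, assuming Python and its standard
    unicodedata module: the underscored CH is stored as the same characters as the separately
    marked letters, and joining the two macrons into one underscore is left to the font. The
    helper names below are hypothetical and only illustrate the encoding.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW


def with_mark_on_each(letters: str, mark: str = MACRON_BELOW) -> str:
    """Append the combining mark after every base letter."""
    return "".join(ch + mark for ch in letters)


def describe(text: str) -> list[str]:
    """List each character as 'U+XXXX NAME' for inspection."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# 'ch' with underscore: c + macron below, h + macron below.
# Nothing in the text itself says "ligature"; a font may choose to draw
# the two macrons as one continuous underscore.
ch_underscored = with_mark_on_each("ch")

for line in describe(ch_underscored):
    print(line)
```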

    Ken suggests that

            Once spanning mechanisms go beyond two base characters,
            it is no longer useful to try to treat them as encoded
            characters, as it is increasingly unlikely that appropriate
            rendering mechanisms will be available for them.

    and hence

            the representation of such in digital text should be
            handled by style and markup, rather than by seeking
            solutions in character encoding.

    The trouble I have with this is that 'style and markup' is not a magic solution that makes
    rendering problems and limitations disappear. Indeed, in my experience the relationship of
    markup to rendering is often much more complicated and difficult than the relationship of
    character encoding to rendering, which has well established mechanisms within font

    There are obvious cases in which tying marks should be handled at some level above text
    encoding and typical font shaping, e.g. in music notation, whether for voice or
    instrument, or mathematical layout. Nomina sacra is an interesting edge case, because one
    can easily see how it might be sensibly handled in markup and tying lines drawn by
    applications independent of text encoding. But as it happens we have a working mechanism
    using the combining overline character (U+0305), and as I understand it the productive use
    of this character is presumed in the Coptic encoding.

    Of course, the productive use of the overline is easy, since the line is straight. Really
    good typographic representation requires beginning, middle and end glyph variants to make
    nomina sacra look nice, but a roughly acceptable rendering can be achieved simply by
    putting the base glyphs and combining marks together in a row.
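
    The overline case can be sketched as follows, assuming Python. The first function is the
    plain "base glyph plus combining mark in a row" encoding; the second labels each position
    the way a font's contextual substitution might, so that beginning, middle and end variants
    of the overline can be chosen. The form names are hypothetical labels, not glyph names from
    any actual font.

```python
OVERLINE = "\u0305"  # COMBINING OVERLINE


def overline(letters: str) -> str:
    """Encode an overlined run: each base letter followed by U+0305."""
    return "".join(ch + OVERLINE for ch in letters)


def overline_forms(count: int) -> list[str]:
    """Pick a positional overline variant for each of `count` letters."""
    if count <= 0:
        return []
    if count == 1:
        return ["isolated"]
    return ["begin"] + ["middle"] * (count - 2) + ["end"]


def shape(letters: str) -> list[tuple[str, str]]:
    """Pair each base letter with the overline variant a font might use."""
    return list(zip(letters, overline_forms(len(letters))))


# Illustrative contracted forms written with an overline.
print(overline("ΘΣ"))
print(shape("ΠΝΑ"))
```

    Without the positional variants, a renderer simply stacks the same overline glyph on each
    letter, which corresponds to the roughly acceptable rendering described above.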

    The case of curved ties, as in the sch example, is more complicated, but not insoluble. The
    important thing, I think, is to give up on the idea of encoding combining tie marks, e.g.
    U+0361, altogether, and instead encode sequences of tied characters with individual
    combining marks (as in the Amharic example above) with a control character such as ZWJ
    indicating desired ligation. So the sch with undertie might be encoded as e.g.

            0073 032E ZWJ 0063 032E ZWJ 0068 032E

    Obviously, this is only going to be optimally rendered using a specialised font, but such
    sequences are specialised by nature.
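
    For illustration, the sequence above can be built and checked mechanically. This is a
    minimal sketch, assuming Python; the function names are hypothetical, and the choice of
    U+032E (COMBINING BREVE BELOW) simply follows the example sequence.

```python
ZWJ = "\u200D"          # ZERO WIDTH JOINER
BREVE_BELOW = "\u032E"  # COMBINING BREVE BELOW


def tied_sequence(letters: str, mark: str = BREVE_BELOW) -> str:
    """Give every letter its own combining mark and join the units with ZWJ."""
    return ZWJ.join(ch + mark for ch in letters)


def code_points(text: str) -> str:
    """Render text as space-separated hex code points, writing ZWJ by name."""
    return " ".join("ZWJ" if ch == ZWJ else f"{ord(ch):04X}" for ch in text)


def tied_letters(text: str, mark: str = BREVE_BELOW) -> str:
    """Recover the base letters of a tied run by dropping ZWJ and the mark."""
    return "".join(ch for ch in text if ch not in (ZWJ, mark))


sch = tied_sequence("sch")
print(code_points(sch))
```

    Because each letter carries its own mark, a font with no support for the ligated form
    still has three individually marked letters to display.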

    From a font design perspective, there are two methods for rendering such a sequence.
    Either a whole sequence can be rendered as a ligature, or sequences of arbitrary length
    can be handled by making the middle section of the tie straight and only the terminals
    curved. This is similar to the approach taken with growing delimiters in mathematical

    John Hudson

    Tiro Typeworks
    Vancouver, BC
    I am not yet so lost in lexicography, as to forget
    that words are the daughters of earth, and that things
    are the sons of heaven.  - Samuel Johnson
