Re: Medievalist ligature character in the PUA

From: Asmus Freytag
Date: Mon Dec 14 2009 - 19:34:49 CST

    On 12/14/2009 11:59 AM, John H. Jenkins wrote:
    > On Dec 14, 2009, at 11:59 AM, André Szabolcs Szelp wrote:
    >> On Mon, Dec 14, 2009 at 6:50 PM, John H. Jenkins wrote:
    >> From what I've read in this thread, we're dealing with a case
    >> where people reproducing medieval texts can't find all the
    >> ligatures they need to reproduce the visible content of the texts
    >> in the fonts they want to use. If medievalists are able to send
    >> texts to one another via plain-text email and understand what the
    >> text is supposed to be, then these ligatures don't belong in
    >> plain text. The people to bug would be the font vendors.
    >> I've always wondered why philology is preferred to diplomatics. To
    >> researchers studying the art of writing, the employment or
    >> non-employment of ligatures is at least as important a plain-text
    >> distinction as it is to a philologist whether the word in question
    >> is fickle or ſickle (=sickle).
    > Unicode's original intent was to provide a common way of supporting
    > the needs of the average, modern-day computer users. It was
    > originally assumed that people with specialized needs (such as
    > typesetting math, working with dead languages, or accurately
    > representing the visual appearance of older texts) would develop a
    > common way of using the PUA to interchange data.
    > This assumption has proven to be false,
    BTW, from today's vantage point, that happened relatively early in the
    history of Unicode, but not everybody got the message at the same time.
    > because most people with specialized needs would really prefer to have
    > a standard way of doing their work, and because the Internet and Web
    > have created a world where data is visible to everybody and not just
    > people who have made the effort to work out a common PUA use.
    Also, because from the beginning that type of approach constituted a
    fundamental violation of Unicode's "single character set - single
    character semantics" philosophy. Heavy use of the PUA would have made
    Unicode the moral equivalent of ISO 2022, different only in having a much
    larger common subset. In retrospect, the success of Unicode effectively
    guaranteed that any thought of artificially relegating significant use
    to the PUA became untenable. The web, finally, and dramatically, sealed
    the fate of this early misconception, a mere five years after Unicode
    was conceived.
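
    To make the interchange problem concrete, here is a small Python sketch
    (an illustration added here, not something from the thread). It uses the
    standard library's unicodedata module to locate private-use characters;
    the function name private_use_chars and the sample string are made up
    for the example. Unicode gives these code points the general category
    Co and nothing more, so the data alone cannot say which ligature a
    sender intended.

```python
import unicodedata


def private_use_chars(text):
    """Return (index, code point label) for each private-use character."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        # "Co" is the Unicode general category for private-use code points
        if unicodedata.category(ch) == "Co"
    ]


# U+E000 stands in for some privately agreed ligature (an assumption).
sample = "re\ue000 and plain text"
print(private_use_chars(sample))  # [(2, 'U+E000')]
```

    A receiver can detect that such a character is present, but has to know
    the sender's private agreement to learn what it is supposed to be.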
    > For modern Latin typography—and certainly for almost all day-to-day
    > use of the Latin script—ligation is a stylistic choice and
    > font-specific. I certainly don't feel any need to specify which
    > ligatures to use when typing this email, or when setting up a meeting,
    > writing my Christmas letter, or playing an online game, and I
    > certainly don't read text closely enough to even notice when ligatures
    > are being used and when they're not. (Well, I do tend to find the ct
    > and st ligatures intrusive and ugly when they show up.) Almost
    > everybody almost all of the time will be content to just let the
    > computer do whatever is necessary to make the text look nice.
    This statement has an unfortunate "English" bias. In many other modern
    languages that use the Latin script, the computer will be *unable* by
    itself to "do whatever is necessary to make the text look nice" as soon
    as ligatures are enabled. The reason is, of course, the rules that
    prohibit ligatures in certain contexts: rules that, like hyphenation,
    ultimately require an understanding of the meaning of the text.

    And as soon as you go back about 100 years, or not even that far, you couldn't
    reproduce the typography of that era (even in an idealized form that
    abstracts from all the accidents of hot metal typography to concentrate
    instead on the rules). You couldn't reproduce that typography, that is,
    without having an understanding of the meaning of the text - because the
    rules for ligation absolutely were not designed for being applicable by

    Interestingly enough, you can approximate the artistic judgment of the
    typesetter with clever algorithms, but those fail when the same
    typesetters used the presence or absence of a ligature to mark a
    distinction between words that otherwise had the same spelling.
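
    One way to picture why meaning-dependent rules resist pure algorithms is
    a lookup table, much like a hyphenation exception list. The sketch below
    is only an illustration under stated assumptions: the table
    NO_LIGATURE_AT and the function mark_word are hypothetical, and the two
    entries are commonly cited German compounds in which a morpheme boundary
    falls between "f" and "l". The marker inserted is U+200C ZERO WIDTH
    NON-JOINER.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Hypothetical exception list: word -> character offsets of morpheme
# boundaries where a ligature must not be formed.
NO_LIGATURE_AT = {
    "Auflage": [3],    # Auf|lage
    "Kaufleute": [4],  # Kauf|leute
}


def mark_word(word):
    """Insert ZWNJ at each listed boundary; unknown words pass through."""
    offsets = NO_LIGATURE_AT.get(word, [])
    # Work right to left so earlier offsets remain valid after insertion.
    for off in sorted(offsets, reverse=True):
        word = word[:off] + ZWNJ + word[off:]
    return word


marked = mark_word("Auflage")
print([f"U+{ord(ch):04X}" for ch in marked[2:5]])
```

    A table like this only covers the words someone has listed. It cannot
    decide between two words with the same spelling, which is exactly the
    case where the typesetter's choice carried information.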
    > This does create a problem for the people who *do* want and need to be
    > very specific about where ligatures are being formed, even when
    > dealing with mass-media forms such as email and Web pages.
    In some cases, modern usage simply had to abandon existing, long
    established rules, because they couldn't be automated satisfactorily.
    Collation rules have definitely been simplified in the recent past to
    accommodate the machine, not the other way around. That makes the
    statement that "English" conventions for ligatures are sufficient a tad
    self-fulfilling over time.

    > Unicode has therefore added the use of ZWJ and ZWNJ as a means of
    > specifying this level of control in plain text when it is necessary.
    > The expectation is that this is an exceptional mechanism for
    > exceptional needs.
    There are three types of exceptions that are possible:

    1) marking up the features of words that result in them receiving a
    non-default ligature handling, where the ligature is present or absent
    because of what the word *means*. For lack of a formal term, you could
    call these semantic or orthographic ligatures (or their opposite:
    ligature-free locations).

    2) marking up text for reproduction where there's a perceived value for
    matching one particular writer's/typesetter's choice of ligation. As no
    algorithm can be written to do that generically, each location would
    have to be marked.

    3) providing hints that improve the appearance of extremely high-end
    typography, where the placement or suppression of individual (stylistic)
    ligatures becomes part of the artistic expression.

    Common to all three of these "exceptions" is not so much their
    exceptional nature as the fact that they lend themselves to being
    implemented as "overrides" relative to a global ligation setting that
    is influenced by the document and the specific font. (Because
    requesting ligatures that are not widely implemented across fonts is a
    pointless exercise, except for final-form documents, the default form
    of that override appears to be the "prevention" of ligatures.)
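
    The "override on top of a global setting" model can be sketched in a few
    lines of Python. This is an illustration only, not a description of any
    real layout engine: the names Override, LigationPolicy and ligate_at are
    invented here, and font support is reduced to a single boolean passed in
    by the caller.

```python
from dataclasses import dataclass, field
from enum import Enum


class Override(Enum):
    PREVENT = "prevent"  # what ZWNJ would express in plain text
    REQUEST = "request"  # what ZWJ would express in plain text


@dataclass
class LigationPolicy:
    # Global setting, influenced by the document and the font.
    ligatures_enabled: bool = True
    # Per-location overrides: boundary index -> Override.
    overrides: dict = field(default_factory=dict)

    def ligate_at(self, boundary, font_has_ligature):
        """Decide whether a ligature is formed at a character boundary."""
        override = self.overrides.get(boundary)
        if override is Override.PREVENT:
            return False  # prevention always wins
        if override is Override.REQUEST:
            return font_has_ligature  # a request cannot create a glyph
        return self.ligatures_enabled and font_has_ligature


policy = LigationPolicy(overrides={3: Override.PREVENT, 7: Override.REQUEST})
print(policy.ligate_at(3, font_has_ligature=True))   # False
print(policy.ligate_at(7, font_has_ligature=False))  # False
print(policy.ligate_at(5, font_has_ligature=True))   # True
```

    The asymmetry is visible in the second call: a request depends on what
    the font offers, while a prevention can be honored with any font.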
    > As with other mechanisms Unicode has developed over the years to deal
    > with specialized needs (using combining marks for unusual accented
    > letters and the use of variation sequences in Han spring to mind),
    > the onus then falls on the font developers and rendering engines to
    > add support for these features. And sometimes there is a significant
    > lag before everybody catches up. For some specialties, there's a
    > significant lag before *anybody* catches up.
    Well put. Sad, but true.
    > From Apple's perspective, our rendering engine supports the use of
    > ZWNJ and ZWJ as Latin ligature controls if the fonts do, and we do
    > make an effort to keep our own fonts updated appropriately. We also
    > provide free tools for font developers to use in order to add this
    > functionality to their font.
    ZWNJ, which suppresses ligatures, should be handled in the layout
    engine. There's a compelling use case (item 1 above) for allowing the
    prevention of ligatures on semantic grounds. Text that is so marked up
    would *never* have a ligature at the marked location, no matter the font
    or style.
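
    A toy substitution pass shows what "handled in the layout engine" could
    mean in practice. The following Python sketch is an assumption-laden
    stand-in, not how any actual engine or font format works: toy_ligate is
    a made-up function, and the Latin presentation forms U+FB00..U+FB04 are
    used merely as visible placeholders for ligature glyphs.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Presentation-form code points used as stand-ins for ligature glyphs.
LIGATURES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
}
# Try the longest sequences first so "ffi" wins over "ff" and "fi".
KEYS = sorted(LIGATURES, key=len, reverse=True)


def toy_ligate(text):
    """Replace ligature sequences; a ZWNJ between letters blocks the match."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(LIGATURES[key])
                i += len(key)
                break
        else:
            # No ligature starts here. ZWNJ itself is invisible, so it is
            # dropped from the output "glyph" stream.
            if text[i] != ZWNJ:
                out.append(text[i])
            i += 1
    return "".join(out)


print(toy_ligate("Auflage") == "Au\ufb02age")       # True: fl ligature formed
print(toy_ligate("Auf" + ZWNJ + "lage") == "Auflage")  # True: ligature blocked
```

    Because the check happens before any font-specific table is consulted,
    the marked location stays ligature-free whatever the font provides.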
    > Unfortunately, the only font we ship with a really rich set of
    > ligatures is Zapfino, and it's not really the kind of font a
    > medievalist would typically use. If, however, there is a font which
    > has the right data in it, it should work just fine on Mac OS.
    For medievalists, the situation is, of course, the reverse.

    I'm surprised that nobody has produced a nice blackletter/Fraktur font.
    That typestyle requires the use of many ligatures (at least in some
    languages), and many others are optional and could be used by a well
    designed algorithm to create optimal type color/page layout for
    reproductions of texts that are still accessible to many modern readers....

    > =====
    > John H. Jenkins
