Re: Special Type Sorts Tray 2001

From: John H. Jenkins (jenkins@apple.com)
Date: Wed Oct 03 2001 - 12:45:38 EDT


At 12:12 AM -0400 10/3/01, DougEwell2@cs.com wrote:
>In a message dated 2001-10-02 9:39:31 Pacific Daylight Time,
>jenkins@apple.com writes:
>
>> BTW, I'm not aware that anybody is revising their fonts to handle ZWJ this
>> way.
>
>Well, according to Unicode 3.1 (UAX #27) they should, right?

True. (*sigh*) Implicitly, however, UAX #27 allows that other
mechanisms for specifying

>> Anyway, there is a long-standing argument on this subject, and
>> unless I misremember the official position of the UTC, this approach
>> -- specifying ligation control in plain text -- is not considered the
>> best mechanism in Latin typography.
>
>I refer back to Michael Everson's two persuasive papers in which he proposed
>a zero-width ligator. It didn't matter to me whether a new ZWL character was
>introduced or whether ZWJ was overloaded, so long as the functionality became
>available. To paraphrase one of Michael's papers, ligation should not be
>considered a fancy-text function in Latin script if it is considered a
>plain-text function in Indic and other scripts, and Unicode does provide
>support for plain-text specification of ligation for Indic scripts.

Er, not entirely.

First of all, the UTC motion on using ZWJ for ligation actually
specified that this approach was not the best for Latin typography.
For some reason, that language didn't make it into UAX #27, but more
detail on ZWJ and Latin ligature control will probably be in Unicode
3.2.

Secondly, as with Latin type, the *full* set of potential ligatures
in Indic typography is pretty much open-ended and the specific set
available in any given font is really up to the font designer.
Unicode provides enough information for the bare minimum set of
ligation rules used in Indic scripts, but unless I'm much mistaken,
it does not provide full control for an arbitrary font.

The use of ZWJ in Latin typography is really intended to provide the
same level of support. Where difference of meaning is potentially
present depending on whether a ligature is formed or not -- which is
the case in some Latin-based languages -- then ZWJ can be used to
indicate the fact. Similarly, in situations where a ligature must
not be formed, ZWNJ can be used to indicate the fact.
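
As a minimal sketch of what that marking looks like in plain text
(Python; the helper names are hypothetical, and only the code points
U+200D ZWJ and U+200C ZWNJ come from Unicode):

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER: requests a ligature
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: forbids a ligature


def request_ligature(left: str, right: str) -> str:
    """Join two pieces of text with ZWJ to ask for a ligature at the seam."""
    return left + ZWJ + right


def forbid_ligature(left: str, right: str) -> str:
    """Join two pieces of text with ZWNJ to prevent a ligature at the seam."""
    return left + ZWNJ + right


def strip_controls(text: str) -> str:
    """Remove both controls, e.g. before searching or comparing text."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


# Illustrative assumption: German "Auflage" (Auf + lage), where an fl
# ligature across the morpheme boundary is usually avoided.
word = forbid_ligature("Auf", "lage")
```

The marked string is one code point longer than the visible word, so
anything that searches or compares the text has to ignore the control,
which is what strip_controls is for.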

It is certainly *not* the intention of the UTC that ZWJ be used
everywhere to turn ligatures on and off in Latin typography. There
are a couple of reasons why that I can come up with straight off the
top of my head.

#1. The ZWJ mechanism doesn't handle default ligatures well. For
example, in typesetting English, fi and fl ligatures are fairly
standard, and most fonts will use them by default. At the same time,
it seems fairly ludicrous to ask people typing English to insert (or
have the system insert) a ZWJ between every "fi" and "fl" pair that
occurs in the text.

#2. The ZWJ mechanism works well in the case where there is a
particular standard ligature which may or may not be present in a font,
such as "ct". In this case, the rendering engine will produce the
ligature (or not) if the ligature is present in the font. That's
fine for the standard ligatures whose presence may reasonably be
anticipated.
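
A toy model of that behavior, in Python. This is only a sketch under
stated assumptions: the function shape and the "lig:" glyph names are
hypothetical, the font is reduced to a set of strings it has ligature
glyphs for, and default (unrequested) ligatures are deliberately left
out.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def shape(text, font_ligatures):
    """Map text to a list of glyph names for a hypothetical font.

    A run of characters joined by ZWJ becomes one ligature glyph if the
    font has it; otherwise it falls back to the individual glyphs.
    """
    glyphs = []
    i = 0
    while i < len(text):
        # Collect a run of the form c ZWJ c ZWJ c ...
        run = text[i]
        j = i + 1
        while j + 1 < len(text) and text[j] == ZWJ:
            run += text[j + 1]
            j += 2
        if len(run) > 1 and run in font_ligatures:
            glyphs.append("lig:" + run)
        else:
            # Fallback: one glyph per character; stray ZWJs draw nothing.
            glyphs.extend(ch for ch in run if ch != ZWJ)
        i = j
    return glyphs


marked = "a" + "c" + ZWJ + "t"
with_ct = shape(marked, {"ct"})     # font has the ct ligature
without_ct = shape(marked, set())   # font lacks it
```

With the ligature in the font the result is one "a" glyph followed by
one ct glyph; without it the same text comes out as three ordinary
glyphs and nothing is lost.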

In real life, however, a font may have a large number of unusual
ligatures. In the paper I wrote on the subject for the UTC, I used
Tekton Pro from Adobe because it's such a font. One of the standard
fonts in Mac OS X is Zapfino from Linotype. It's based on Hermann
Zapf's handwriting and has thousands of glyphs even though it's
basically a Roman-only font. Among the glyphs are a large number of
unexpected ligatures, "pp", "th", even "Mrs." and "Co."

Apple's implementation of the font turns many (but not all) of these
ligatures on by default. The overall effect is a significant
improvement in the appearance of Zapfino text. In any event, having
the user insert ZWJ in plain text for these unexpected ligatures on
the off-chance that someone is going to display it with Zapfino and
have the appropriate ligature set turned on seems unreasonable.

#3. OK, so there are more than a couple reasons off the top of my
head. The way that software generally handles turning ligatures on
and off right now in virtually all programs that support it is for
the user to select a range of text and through a menu item or other
action turn a particular set of ligatures on or off. The ZWJ
mechanism doesn't allow the type designer to group ligatures in sets,
and it would unreasonably increase the burden on software that keeps
the current UI. That is, if I want to turn on ffi ligatures in my text
by hand, I'd have to remember to put ZWJ in twice, and if the
software did it, it would have to scan the text, compare the contents
of the scan with the set of potential ligatures in the given ligature
set in the font, and either insert or remove ZWJ between every
character pair as appropriate.
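
To give a rough idea of the work involved, here is a sketch in Python
of what the software would have to do for one menu action. The
function names and the representation of a ligature set as a set of
strings are assumptions for illustration, not any real font or
text-system API.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def _ordered(ligature_set):
    # Longest first, so "ffi" is tried before "fi"; ignore entries that
    # are too short to be ligatures.
    return sorted((l for l in ligature_set if len(l) >= 2),
                  key=len, reverse=True)


def mark_ligatures(text, ligature_set):
    """Turn a ligature set 'on': put ZWJ between every adjacent pair of
    characters inside each occurrence of a ligature in the set."""
    ligs = _ordered(ligature_set)
    out = []
    i = 0
    while i < len(text):
        for lig in ligs:
            if text.startswith(lig, i):
                out.append(ZWJ.join(lig))  # "ffi" -> f ZWJ f ZWJ i
                i += len(lig)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def unmark_ligatures(text, ligature_set):
    """Turn the set 'off' again: remove the ZWJs added for those ligatures."""
    for lig in _ordered(ligature_set):
        text = text.replace(ZWJ.join(lig), lig)
    return text


marked = mark_ligatures("office", {"ffi", "fi"})
```

Here "office" picks up two ZWJs for its single ffi ligature, and every
toggle of the menu item rewrites the text itself rather than a style
attribute on the selected range.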

>Of course ligation control is font-specific. That is why the ZWJ solution is
>elegant -- it falls back gracefully to the two (or three...) unligated glyphs
>in the event the ligature is unavailable in the font. This is still better
>than displaying a black box, which is how William Overington's private-use
>characters would appear in most fonts,

True. Encoding ligatures as characters is a bad thing.

>or forcing the user to incur the
>overhead of fancy text.

In what way is fancy text an unreasonable burden on the user? If
anything, plain text is becoming an increasingly rare beast except in
source code.

-- 

John H. Jenkins jenkins@apple.com jenkins@mac.com http://homepage.mac.com/jenkins/


