Re: Latin ligatures and Unicode

From: John Jenkins (jenkins@apple.com)
Date: Mon Dec 27 1999 - 14:55:27 EST


on 12/27/99 12:37 AM, Eberhard Pehlemann at e.pehlemann@gmx.de wrote:

But from my point-of-view there are several arguments against letting the
software do everything (and preventing explicit specification of ligation
behaviour by the author):

1. As I pointed out in my reply to John Jenkins, there are cases where the
software will not be able to determine ligation correctly. (The "Wachstube"
example) In these cases the author MUST explicitly specify ligation. So
Unicode MUST provide means to do so, either by defining ZWJ or ZWL to be the
character that specifies ligation behaviour in Latin scripts. I agree with
Michael Everson who has claimed the need for such a character.

I haven't seen anyone claim that ligation control should be handled entirely
by software. The OpenType/AAT model is explicitly opposed to that
assumption. The end user still has complete control over the process. The
software merely enables this and provides default behavior.
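
For readers who want to see what the plain-text proposal under discussion
amounts to, here is a minimal Python sketch. It uses the existing Unicode
characters ZWJ (U+200D) and ZWNJ (U+200C); ZWL and ZWNL are only proposals in
this thread and have no code points. The helper names mark_boundary and
strip_controls are made up for this illustration, and whether a renderer
honours these characters for Latin ligatures is exactly the open question.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER: here, a request to ligate
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: here, a request not to ligate


def mark_boundary(word: str, index: int, control: str) -> str:
    """Insert a control character between word[index - 1] and word[index].

    Hypothetical helper: it only edits the character string; it does not
    render anything or consult a font.
    """
    if not 0 < index < len(word):
        raise ValueError("index must fall strictly inside the word")
    return word[:index] + control + word[index:]


def strip_controls(text: str) -> str:
    """Recover the bare letters, e.g. for searching or comparison."""
    return text.replace(ZWJ, "").replace(ZWNJ, "")


# "Wachstube": the s is at index 4 and the t at index 5, so the s/t
# boundary is index 5. The author picks one of the two markings.
joined = mark_boundary("Wachstube", 5, ZWJ)
split = mark_boundary("Wachstube", 5, ZWNJ)
```

Both strings contain the same nine letters plus one invisible character, so
any software that compares or searches text has to strip or ignore the
controls, which is part of the cost being weighed here.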

2. Software can do a lot, but does it do so? I am pretty sure that there
does not exist a single text editor that handles all blackletter ligatures.
But even in the much more common case of roman fonts and the fi, ffi, ffl …
ligatures: How many programs actually use ligature shapes to display these
character sequences? And how much money do they cost?

It would be possible to write such an editor for Mac OS 9 in about a page of
code. We're working on sample code and demo applications to show how this
can be done. And it's been possible to do ever since we released the late
lamented GX five years ago.

Adobe InDesign (an admittedly non-cheap program) can handle an awful lot of
Latin ligatures. I believe it could handle anything you put in your
OpenType font, at least manually. Given InDesign's extensive plug-in
architecture, it would be possible to get it to do whatever you want.

3. With the OpenType or related fonts every font designer can offer ligature
shapes in his fonts. I would like to do so for the (few?) people that like
blackletter fonts. If there was a ZWJ or an official declaration that
Unicode ZWJ and ZWNJ should be used to force or prohibit ligation in Latin
scripts, then every writer could use the ligature shapes offered by the
fonts. He would need only basic software abilities built into the operating
system and thus also accessible from small and affordable applications. He
would not need to buy InDesign and then see that he gets fi ligatures but
doesn't get others like sch, tz or ft in Fraktur.

There are two issues here. One is getting system software support. The
other is getting applications to take advantage of the system software
support. The latter can be an enormous uphill battle, as our experience
getting people to support ATSUI shows.

The former is also enormously problematical. The problem is that the
TrueType spec doesn't offer any direct support for mapping multiple
characters to single glyphs; the presumption is that this is handled in the
AAT/OpenType tables. I don't know how OpenType libraries like UniScribe or
CoolType work, but I know enough about the guts of ATSUI to say that it
would be fairly difficult to get it to handle ZWL; ZWNL would be
comparatively simple. I would imagine that OpenType would have similar
problems. Basically ZWL would be useless except for a plain-text exchange
mechanism, and even there it would be problematical.

Be aware of this: Any architectural change to Unicode requires applications
(and usually system software) to be rewritten. Old applications won't simply
pick up the new support automatically. (Well, this isn't quite true.
ATSUI-savvy apps would.)

I think it's a bit on the naïve side to assume that just adding ZWL would
solve the problem. We'd be three or four years at best before we'd get
support at the OS level and then down to the app level. The point is that
there is already a mechanism in place on both Windows and the Mac that can
solve the problem. We'd be far better off IMHO to define exchange formats
for cross-platform use that include ligation information.

4. What about portability and exchangeability of text fragments, if one
program handles ligatures and another does not? Should we forget copy and
paste, because ligation information simply gets lost when pasting from one
application to another? ZWJ or ZWL would certainly not get lost in this
process!

Again, this is also true for typeface selection, italics, boldface, point
size, super- and subscripting, all of which carry semantic information.
Apple's solution has been to define a common rich-text exchange format. Mac
applications since 1984 have been able to interchange formatting information
back and forth. ATSUI apps can do the same, and this includes ligature
control.
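
As a rough illustration of carrying ligature control alongside the text
rather than inside it, here is a small Python sketch. The LigatureRun record
and the ligation_at function are hypothetical names invented for this
example; this is not the Mac rich-text format or the ATSUI API, only the
general idea of style runs attached to character ranges.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LigatureRun:
    """Hypothetical style run: characters [start, end) may or may not ligate."""
    start: int
    end: int
    ligate: bool


def ligation_at(runs: Iterable[LigatureRun], index: int,
                default: bool = True) -> bool:
    """Return the ligation setting covering the character at `index`.

    Characters not covered by any run fall back to `default`, which stands
    in for the font's or the software's default behavior.
    """
    for run in runs:
        if run.start <= index < run.end:
            return run.ligate
    return default


# The text itself stays free of control characters.
text = "Wachstube"
# The author switches ligation off for the "st" pair (indices 4 and 5).
runs = [LigatureRun(start=4, end=6, ligate=False)]
```

With this layout the character string is unchanged, and an application that
ignores the runs simply shows the default rendering; the trade-off is that
the runs are lost whenever only the plain text is exchanged.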

The discussion which arose from my first mail is still going on. I feel that
the problem of ligatures in the Latin script is not yet solved. I think that
it should be solved inside Unicode by defining control characters that
explicitly define ligation behavior. These controls then could be used in
conjunction with OpenType font tables to provide an almost
application-independent means of controlling ligature display.

Again, I disagree. If you're using OpenType or AAT tables (or equivalent
rendering technologies), you don't need the controls in plain text; and if
you're not, they won't do you any good.

=====
John H. Jenkins
jenkins@apple.com
tseng@blueneptune.com
http://www.blueneptune.com/~tseng


