*** RECOVERED EMAIL ***: Re: Proposal: Ligatures w/ ZWJ in OpenType

From: John H. Jenkins (jenkins@apple.com)
Date: Mon Jul 15 2002 - 13:24:16 EDT

On Monday, July 15, 2002, at 09:58 AM, Doug Ewell wrote:
> No, what bothers me is that the ZWJ/ZWNJ ligation scheme is starting to
> look just like the DOA (deprecated on arrival) Plane 14 language tags.
> In each case, Unicode has created a mechanism to solve a genuine (if
> limited) need, but then told us -- officially or unofficially -- that we
> should not use it, or that it is "reserved for use with special
> protocols" which are never defined or mentioned again.

I'm not sure I agree with you here. The position of the UTC is not that
ZWJ should never be used and we're sorry we added it, which is the case with
the Plane 14 language tags. It's that the ZWJ should not be the primary
mechanism for providing ligature support in many cases. That's as far as
it goes.

> The UTC may have "intended" that ZWJ ligation be used only in rare and
> exceptional circumstances, but UAX #27, revised section 13.2 doesn't say
> that.

The latest word is the Unicode 3.2 document, not the Unicode 3.1 document.
It says:

Ligatures and Latin Typography (addition)

It is the task of the rendering system to select a ligature (where
ligatures are possible) as part of the task of creating the most pleasing
line layout. Fonts that provide more ligatures give the rendering system
more options.

However, defining the locations where ligatures are possible cannot be
done by the rendering system, because there are many languages in which
this depends not on simple letter pair context but on the meaning of the
word in question. 

ZWJ and ZWNJ are to be used for the latter task, marking the non-regular
cases where ligatures are required or prohibited. This is different from
selecting a degree of ligation for stylistic reasons. Such selection is
best done with style markup. See Unicode Technical Report #20, “Unicode in
XML and other Markup Languages” for more information.
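
(An illustration, not part of the quoted Unicode 3.2 text.) In plain text the two controls are just code points placed between the letters in question: U+200D ZERO WIDTH JOINER to request a ligature and U+200C ZERO WIDTH NON-JOINER to prohibit one. A minimal Python sketch follows; the helper names are hypothetical, and the German word "Auflage" is only an assumed example of a compound where an "fl" ligature across the "Auf" + "lage" boundary would be prohibited.

```python
import unicodedata

ZWJ = "\u200D"   # ZERO WIDTH JOINER: marks a place where a ligature is required
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: marks a place where a ligature is prohibited


def request_ligature(text: str, index: int) -> str:
    """Hypothetical helper: insert ZWJ between text[index - 1] and text[index]."""
    return text[:index] + ZWJ + text[index:]


def prohibit_ligature(text: str, index: int) -> str:
    """Hypothetical helper: insert ZWNJ between text[index - 1] and text[index]."""
    return text[:index] + ZWNJ + text[index:]


# Prohibit an "fl" ligature across the boundary in "Auf" + "lage".
no_fl = prohibit_ligature("Auflage", 3)

# Request a "ct" ligature, which a font may or may not provide.
want_ct = request_ligature("ct", 1)

# Print the marked strings code point by code point, with character names.
for marked in (no_fl, want_ct):
    print([f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in marked])
```

Whether a ligature glyph actually appears still depends on the font and the rendering system; the marked text only records where one is required or prohibited.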

> It says that ZWJ and ZWNJ *may be used* to request ligation or
> non-ligation, and that "font vendors should add ZWJ to their ligature
> mapping tables as appropriate." It does acknowledge that some fonts
> won't (or shouldn't) include glyphs for every possible ligature, and
> never claims that they must (or should). It specifically does *not* say
> that ZWJ ligation is to be restricted to certain orthographies, or to
> cases where ligation changes the meaning of the text.

This is correct. Nor is this changed in Unicode 3.2. The goal is to make
the ZWJ mechanism available to people who feel it is appropriate to meet
their needs, but to try to inform them that in the majority of cases, a
higher-level protocol would be better.

Adobe doesn't have to revise InDesign, for example, to insert ZWJ all over
when a user selects text and turns optional ligatures on. OTOH, the hope
is that if ligatures are available InDesign will honor the ZWJ-marked ones,
even if ligation has been turned off.
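
The behaviour hoped for here can be written down as a small decision rule. The Python sketch below is only that: the function and parameter names are hypothetical, and it does not describe how InDesign, OpenType, or AAT actually implement anything. It assumes an explicit ZWJ or ZWNJ in the text takes priority over the user's stylistic setting, and that nothing can be ligated unless the font supplies the glyph.

```python
from typing import Optional

ZWJ = "\u200D"   # ZERO WIDTH JOINER
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def should_ligate(control: Optional[str],
                  optional_ligatures_on: bool,
                  font_has_ligature: bool) -> bool:
    """Hypothetical rule for one letter pair.

    control is ZWJ, ZWNJ, or None, depending on what the text has
    between the two letters.
    """
    if not font_has_ligature:
        return False  # no ligature glyph available: render the letters unligated
    if control == ZWNJ:
        return False  # explicitly prohibited in the text
    if control == ZWJ:
        return True   # explicitly requested, even if the style setting is off
    return optional_ligatures_on  # unmarked pair: follow the style setting


# A ZWJ-marked pair is ligated although the user turned optional ligatures off.
print(should_ligate(ZWJ, optional_ligatures_on=False, font_has_ligature=True))
```

Under this rule the application never has to insert ZWJ itself: turning optional ligatures on only changes the outcome for unmarked pairs.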

John Hudson has recommended what seems a reasonable way to handle this in
OT. Apple will be releasing new versions of its font tools in the near
future, and the documentation will include a recommendation for how this
can be done with AAT. We've been revising our own fonts as the
opportunity presents itself to support ZWJ as well. (The system and
ATSUI-savvy applications require no revision.)

The push-back coming from the font community on the issue has to do mostly
with the communications problem that they weren't aware of it in as timely
a fashion as would have been best, and the concern that font developers
and application/OS developers will be forced to add ligature support where
they have felt it inappropriate in the past.

> ZWJ/ZWNJ for ligation control is part of Unicode. It is not always the
> best solution, but it is *a* solution, and should be available to the
> user without restriction or discouragement.

It's discouraged when it's inappropriate. It isn't deprecated. There are
numerous places where Unicode provides multiple ways of representing
something. In this instance, Unicode is trying to delineate where a
particular mechanism is appropriate and where inappropriate.

John H. Jenkins

This archive was generated by hypermail 2.1.2 : Thu Jul 18 2002 - 14:57:00 EDT