From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Sat Nov 24 2007 - 21:43:08 CST
On 11/24/2007 2:31 PM, John H. Jenkins wrote:
> I can't speak for other platforms, but on Mac OS X, the normal
> behavior for Latin text is that plain text uses whatever ligatures the
> font has on by default.
>
> I, at least, would argue that ligation controls don't (as a rule)
> belong in plain text, since ligation is intimately bound with the
> typographic needs of a specific font.
That sounds like the unreconstructed and simplistic view of the
character glyph model with which Unicode started out, and which caused
(and causes) all kinds of trouble.
To summarize some of the key issues that should by now be recognized as
requirements for the full character glyph model:
1) Some scripts have required ligatures
2) Some font styles have required ligatures ('ch' in Fraktur, but not in
most other Latin styles)
3) Some languages have *prohibited* ligatures
4) Some scripts require additional levels of choice (e.g. up to four in
Indic)
5) Some users have special needs for additional control over ligatures
(extreme design)
The full character glyph model needs to allow for all of these - while
minimizing impact on what gets encoded.
In that sense, the OpenType technology has gotten a lot of things right
by providing the three levels of ligatures that John Hudson summarized
here (required, standard, and discretionary).
However, requirements 3 and 4 cannot be met without giving the user a way to
(locally) override the global settings. As the requirement for
prohibition of ligatures often comes from orthography (or if you want,
from the intersection of orthography and typography) it is appropriate
to use coded characters. (In the kinds of cases that were discussed on
this list at length, allowing a ligature changes the possible readings
of the word in question - that's no longer pure typography, that's
orthography).
Luckily, it seems that ZWNJ is the only character needed for
language-specific requirements - at least I don't know of an example to
the contrary. The problem is: when should the user supply it? If
ligatures are disabled by default, then the ZWNJ is not needed (not
using any ligatures is permissible in many type styles, though not in
Fraktur).
In some sense then, if non-required, but non-fancy, ligatures were
always enabled, users would (need to) supply the ZWNJ by default, and
text would be correctly coded. But leaving ligatures enabled by default
makes the use of plain text controls (in this case, ZWNJ) required.
Otherwise, you get what are essentially misspelled words (or
mis-typeset, if you want) in these languages.
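To make the plain-text side of this concrete, here is a minimal Python
sketch that is not part of the original discussion. It assumes the
commonly cited German example "Auflage" (Auf + lage), where an 'fl'
ligature would span a morpheme boundary. The helper names mark_boundary
and strip_zwnj are hypothetical, and the boundary position is supplied
by hand, since finding it automatically would require a dictionary or
morphological analysis.

```python
ZWNJ = "\u200C"  # U+200C ZERO WIDTH NON-JOINER


def mark_boundary(word: str, boundary: int) -> str:
    """Insert ZWNJ at a known morpheme boundary (hypothetical helper).

    `boundary` is the index at which the second morpheme starts.
    """
    return word[:boundary] + ZWNJ + word[boundary:]


def strip_zwnj(text: str) -> str:
    """Remove every ZWNJ, e.g. before a plain string comparison."""
    return text.replace(ZWNJ, "")


# "Auf" + "lage": the 'f' and 'l' should not be ligated across the boundary.
marked = mark_boundary("Auflage", 3)
```

The marked string is one code point longer than the original and no
longer compares equal to it, so searching and spell-checking have to be
prepared to ignore the ZWNJ.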
You can't have one without the other, that's why I called your statement
a "simplistic" view of the character glyph model.
I think it's worth repeating that there is another dimension where it is
in fact up to the font to make the decision on ligatures.
A Fraktur font would need to supply all the ligatures that are required
in Fraktur (e.g. also the 'ch' ligature) and mark them as "required",
because in that typographical style they are required, even though in
other Latin styles they may not be. A monospaced font would not normally
support any ligatures as either required or default (as John Hudson
pointed out a couple of posts ago), because doing so would violate the
general expectation that users of monospaced fonts have of getting one
display position for each 'character'.
For Indic scripts, the model has come together over the last decade, and
it looks like all the distinctions can be represented on the character
code level.
Finally, designers will always need additional controls, which
specialized applications and fonts will supply.
So, it looks like the character glyph model is finally getting broad
enough to meet the real requirements, but only if you allow both turning
on some ligatures by default and the use of at least ZWNJ.
There are two unresolved concerns, and not minor ones.
One, if ligatures are suddenly enabled, what will happen to all the
existing texts that were written without ZWNJ inserted? Because of the
nature of this issue it is *not* possible to supply the missing ZWNJs via the
layout system. That's a real problem for languages that have such
requirements.
Two, many non-monospaced fonts are used in environments (such as text
input widgets) where a more 1:1 relation between what is typed and what
is displayed is appropriate.
Both of these essentially require that at least some ligatures can be
globally disabled for certain documents or certain uses. Because some
ligatures are required (and because this requirement varies by font
style, not merely script) the old "ligatures on, unless globally
disabled, or overridden in script-specific instances by character code
(e.g. as in lam-alif)" model was too simplistic.
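As a rough illustration of the refined model, the following Python
sketch combines the three ligature levels mentioned above (required,
standard, discretionary) with a global setting and a local ZWNJ
override. It is only one possible reading of this message, not a
description of how any layout engine works: the names Level, Settings
and forms_ligature are hypothetical, and the assumption that a ZWNJ
suppresses a ligature at every level is mine.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Ligature level as declared by the font (hypothetical model)."""
    REQUIRED = "required"
    STANDARD = "standard"
    DISCRETIONARY = "discretionary"


@dataclass
class Settings:
    """Global, per-document or per-widget ligature settings."""
    standard_enabled: bool = True
    discretionary_enabled: bool = False


def forms_ligature(level: Level, settings: Settings,
                   zwnj_between: bool = False) -> bool:
    """Decide whether a ligature the font offers is actually formed."""
    if zwnj_between:
        # Assumption: a coded ZWNJ overrides the font and the settings.
        return False
    if level is Level.REQUIRED:
        # Required ligatures ignore the global switch.
        return True
    if level is Level.STANDARD:
        return settings.standard_enabled
    return settings.discretionary_enabled


# A text input widget might turn standard ligatures off globally.
widget = Settings(standard_enabled=False)
```

Under this sketch a Fraktur font would declare its 'ch' ligature as
REQUIRED, so it survives the widget setting, while a monospaced font
would simply declare no required or standard ligatures at all.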
Technology, if widely implemented, feeds back onto usage. It would be
interesting to see whether widespread adoption of a simplistic
ligatures-on-by-default model would lead languages that now have
prohibited ligatures to give up on this concept. Already, the inability of
spell-checkers to handle unlimited compound nouns has had a noticeable
impact on German spelling (books have been written on that subject...and
"Word" is the villain).
A./