RE: Latin ligatures and Unicode

From: Marco.Cimarosti@icl.com
Date: Mon Dec 20 1999 - 15:04:06 EST


MC>Why is this new zero-width ligator being proposed, rather than overloading
MC>the existing zero-width joiner with this new function? (ZWJ currently has no
MC>defined meaning for European scripts, right?)

ME>The arguments are rather subtle, but there are good reasons for considering
ME>the two separate. It has to do with bidi as well as the inherent nature of
ME>the script (i.e. European scripts are inherently non-cursive, but Arabic is
ME>inherently cursive).

JC>Well, I think "normally" rather than "inherently" is the right term. A
JC>"cursive" Latin font would be one that has contextual variants (initial, final,
JC>medial, isolated), and controlling such a thing would require ZWJ and ZWNJ.
JC>To render an initial "f" in isolation would require LATIN SMALL LETTER F
JC>followed by ZWJ.

ME>The ZWL is intended for use with Latin, Greek, Cyrillic, Armenian, Runic,
ME>Etruscan, and Ogham. At least I know it would satisfy the requirements
ME>those scripts have. It could also be used for some as-yet unencoded
ME>scripts, such as Cirth.

JC>Unless rendering processes are forbidden to ligature without a ZWL,
JC>then a ZWNL is also needed in order to block ligaturing "fi" in
JC>Turkish, etc.

ME>Is this likely? I wasn't aware that any of the ZW characters were
ME>obligatory. But I assume that f+ZWL+i = fi-ligature and f+i = unligated fi.

Yes, I think it's likely. "f+ZWL+i" would be an explicitly required
ligature, "f+ZWNL+i" would be an explicitly forbidden ligature, and "f+i"
would be the programmers' favorite expression: "the default".
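
To make the three cases concrete, here is a minimal sketch in Python. It
is only an illustration: ZWL and ZWNL are proposals with no code points,
so the sketch assumes the existing ZWJ (U+200D) and ZWNJ (U+200C) as
stand-ins, and the function name ligature_request is hypothetical.

```python
# Assumption: ZWJ/ZWNJ stand in for the proposed (unencoded) ZWL/ZWNL.
ZWJ = "\u200d"   # ZERO WIDTH JOINER
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def ligature_request(text, first="f", second="i"):
    """Classify how `text` asks for the first+second pair to be rendered.

    Returns "required", "forbidden", "default", or "absent".
    Hypothetical helper, not part of any real rendering engine.
    """
    if first + ZWJ + second in text:
        return "required"    # f+ZWL+i: ligature explicitly requested
    if first + ZWNJ + second in text:
        return "forbidden"   # f+ZWNL+i: ligature explicitly blocked
    if first + second in text:
        return "default"     # f+i: left to the renderer or font
    return "absent"
```

A renderer following this scheme would ligate in the "required" case,
never ligate in the "forbidden" case, and apply its own policy (for
instance, no "fi" ligature in Turkish text) in the "default" case.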

Many of my friends are graphic designers (poor things!) at fashion magazines.
Like Unicoders, they disagree on almost everything but, exceptionally, they
all seem to be militants of the anti-fi-ligature party. Especially in
titles, they usually put some form of zero-width space between their "f"s and
"i"s to prevent QuarkXPress from composing the ligature (I told them there
must be an option somewhere, but they are graphic designers :-).

But my point was a different one. European cursive fonts seem to me a really
minor issue (certainly much less important than Michael's scholarly needs or
my friends' design preferences), so why is the only mention of a European
usage for ZWJ and ZWNJ in Unicode dedicated to this non-issue?

If it weren't for that little passage (which, btw, was there just to explain
the Arabic alphabet to people who don't know it), would Michael have
proposed his ZWL anyway? Or would he have proposed using the existing ZWJ
for the purpose of forcing ligatures in European scripts?

I don't understand the cursiveness example that Michael gives above; what
other subtle reasons are there not to unify ZWL with ZWJ?

By the way, using ZWJ to encode Michael's ZWL does not prevent it from being
used *also* to control the shape formation of European cursive fonts (provided
they really exist and need such a thing) and, of course, does not touch its
meaning in Arabic or Indic. If this unification is done, ça va sans dire,
the ZWNL that John mentions would naturally be unified with ZWNJ.

_ Marco


