From: James Kass (email@example.com)
Date: Sun Nov 04 2007 - 14:53:25 CST
Jukka K. Korpela wrote,
> Typographic ligation is generally not a character-level issue at all.
It depends on the level of granularity desired. Some people
reproducing, say, a medieval English text would be happy to
just capture the essentials of the text. Others would want to
capture the text exactly (or as close as possible), even to the
extent of forming ligatures where the original typesetter
used a ligature and not forming a ligature where the original
typesetter did not. And, those old typesetters weren't always
consistent, even in the same book/document. They might
have had a shortage of the ligature types and just used
the next best thing.
> Thus, by changing "æ" to "a", ZWJ, "e" you change the identity of the data
> as characters, just as you would by changing "w" to "v", ZWJ, "v". Some
> characters (including "w", "ñ", and "ß") have _originated_ as ligatures but
> turned into characters.
> With "æ", there's the particular problem that in some languages, it is
> definitely a separate character that is not decomposable into anything,
This reminds me of Tamil, where many modern users consider "X" (ksh)
to be a separate letter, regardless of its antiquarian origins. "æ" is a
ligature which some languages may treat as a separate character,
just as some languages consider digraphs (strings) to be separate
letters.
> whereas in some usage, mostly in writing words of Latin origin, it is a
> ligature of "a" and "e". In the latter role, it's somewhere between a
> typographic ligature and an expressive ligature. When encoding data, you
> need to make up your mind and write it either as LATIN SMALL LETTER AE
> (thinking that this may, in addition to other use, be used to represent the
> ligature) or a "a" followed by "e".
> Using ZWJ between the letters in the latter choice is more or less
> illusionary. You would be asking for typographic ligature (or cursive
> joining) in general, not the established form of LATIN SMALL LETTER AE in
> particular. Your request might well be denied, because implementations need
> not perform ligation, or they might implement it for some character
> combinations only.
On my system, with my default set-up, "a" + ZWJ + "e" displays the same
glyph as LATIN SMALL LETTER AE. So, I consider its use to be simply
an alternate spelling which displays no differently.
(Since there is a dedicated character for "ae" ligature, I use
that character as needed, and only trot out the "a" + zwj + "e"
for purposes of illustration.)
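
To make the "alternate spelling" point concrete at the character level, here is a minimal sketch using Python's standard unicodedata module. The helper names (describe, equal_under_any_normalization) are made up for illustration. It only inspects code points and normalization forms; whether the two spellings render as the same glyph depends on the font and rendering system, which this code cannot show.

```python
import unicodedata

AE_LETTER = "\u00e6"    # LATIN SMALL LETTER AE
AE_JOINED = "a\u200de"  # "a" + ZERO WIDTH JOINER + "e"

def describe(text):
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

def equal_under_any_normalization(a, b):
    """True if a and b become identical under at least one normalization form."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )

for line in describe(AE_LETTER) + describe(AE_JOINED):
    print(line)

# U+00E6 has no decomposition mapping, so no normalization form
# turns it into "a" + "e" (with or without ZWJ), or the reverse.
print(unicodedata.decomposition(AE_LETTER) == "")
print(equal_under_any_normalization(AE_LETTER, AE_JOINED))
```

So the two spellings may look alike on a given system, but a search or comparison done on the encoded text treats them as different strings unless it applies its own folding.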
>> If ZWJ requests a more joined form of two characters if
>> the system can provide one, and the user desires to represent
>> the AP SYMBOL in plain text, and the user inserts a ZWJ
>> between "A" and "P", and the system can provide a more
>> joined form of A+P, and that more joined form happens
>> to match the desired appearance, is there a problem?
> There need not be any problem. The ZWJ character is meant to be used for
> suggestions on ligation in exceptional situations where ligation cannot be
> handled at some other level.
> But ZWJ just asks for ligation or cursive joining, not any _particular_ kind
> of ligation. Thus, you might end up with something that is clearly not just
> the letters "AP" in normal presentation and clearly not the commonly used AP
> symbol.
Yes, this is true. We might get some other joined form of "AP". For example,
somebody could view the text with a specialty font designed to generate the
logo of the Academic Press,
( http://www.academicpress.com/brochures/academicpress/ )
...but, in my opinion, that wouldn't be so awful. If it was someone's
intention to capture the difference in plain text between where
the newspaper used the ligature/symbol between parentheses and
where the newspaper only used the letters AP in parentheses, the
difference would still be captured in the text stream. And the
difference would still be visible in the display.
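
As a rough illustration of the difference surviving in the text stream, the sketch below scans a string for "AP" and reports whether each occurrence was entered with a ZWJ between the letters. The sample sentence and the function names (ap_occurrences, strip_zwj) are invented for this example, not taken from any newspaper text or library.

```python
import re

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Matches "AP" with an optional ZWJ between the two letters.
AP_PATTERN = re.compile(f"A({ZWJ})?P")

def ap_occurrences(text):
    """Return (start_index, has_zwj) for every A[ZWJ]P found in text."""
    return [(m.start(), m.group(1) is not None)
            for m in AP_PATTERN.finditer(text)]

def strip_zwj(text):
    """Drop every ZWJ, which also discards the ligature request."""
    return text.replace(ZWJ, "")

# Hypothetical sample: first "AP" carries a ZWJ, second does not.
sample = f"(A{ZWJ}P) reported today; a later dispatch (AP) followed."

for start, has_zwj in ap_occurrences(sample):
    kind = "ligature requested" if has_zwj else "plain letters"
    print(start, kind)
```

Whatever joined form a particular font supplies, the distinction between the two occurrences stays recoverable from the encoded text, and is lost only if something like strip_zwj is applied.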