Re: Latin ligatures and Unicode

From: peter_constable@sil.org
Date: Thu Dec 30 1999 - 01:36:46 EST


       Some comments in response to Marco's comments (note that I make
       various observations that support both sides of this debate):

       MC>But there are cases when having or not a ligature affects
       the meaning of the text. These are certainly rare and marginal
       cases (in fact, this is being discussed after Unicode's 10th
       birthday) but they exist and, because the *meaning* of the text
       is affected, not only its presentation, the issue should be
       addressed at the plain text level.

       Some comments have been made that italics is used semantically,
       but clearly should not be expressed in plain text. We should
       note, though, that the type of meaning borne by such uses of
       italics is in the realm of linguistic pragmatics (if you're
       familiar with that branch of linguistic study), and not
       *lexical* semantics. In the case of "wachstube", however, the
       semantic distinction borne by the Fraktur ligation is lexical.
       In my mind, that is a necessary condition for requiring this
       meaning to be expressed in plain text. (This condition alone is
       not sufficient, however; it must also be at least true that the
       ligation is non-predictable and obligatory.)

       MC>The contexts where these graphic variations become
       significant are mainly historical and meta-linguistic, but they
       are nevertheless important for someone.

       "Important for someone" is not a sufficient requirement for
       something to be expressed in plain text. To a typographer, the
       choice of font is important; to a handwriting expert, the
       particular shape a person uses for "s" is important; etc. But
       such information doesn't belong in plain text.

       MC>Do you remember Carlos Levoyer, the guy who was dealing with
       ancient Spanish texts? I suggested that he drop all the
       ligatures in his old books, and expand them to regular modern
       spelling in his on-line edition. But if he does not want to do
       so, he may need a way to specify ligatures like "ct" (plus some
       special letters, like the long "s"). And, perhaps, he has to do
       this in HTML, or in a database field: that is, in plain text,
       sort of.

       Surely his ligatures are not indicating lexical semantic
       distinctions but rather were merely the choice of the
       author/typesetter of the document and were non-obligatory in
       the Spanish writing system of that time. He may well want to
       accurately record and present the ligatures, but the onus for
       this case shouldn't be placed on what can be expressed in plain
       text.

       MC>A non-historical example for the need to control ligatures
       in plain text has already been done: the "fi" ligature in
       Turkish. In most roman fonts, the dot over "i" disappears in
       the "fi" ligature, because it merges with the "f"'s top. This
       aesthetic adjustment is perfectly innocent in most languages,
       because the dot on "i" has no special meaning (it is just a
       heritage from handwriting). In Turkish, however, dotless "i"
       is a separate letter so, in certain fonts, the ligature loses
       the distinction between "fi" and "fI".

       I've suggested in an earlier message in this thread that
       ideally all runs of text should be tagged to indicate their
       language. If this is done, then it would be possible for that
       information to be used by the rendering engine in shaping the
       text and for the font developer to specify that the "fi"
       ligature *not* be formed for Turkish but that it be formed for
       other languages. (Current score, as I recall: OpenType already
       provides support for such language-specific substitution; such
       support is not currently available in AAT but is being
       considered.)
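
To make the idea concrete, here is a minimal sketch of language-tagged
ligature selection. It is only an illustration, not how OpenType or any
real rendering engine is implemented: the function name shape_run, the
NO_FI_LIGATURE_LANGS set, and the use of the precomposed character U+FB01
to stand in for a font's "fi" ligature glyph are all assumptions made for
the example.

```python
# Hypothetical sketch: a "rendering engine" that consults the language tag
# of a text run before forming the "fi" ligature.

# Assumption: languages in which dotted and dotless "i" are distinct letters,
# so the ligature must not be formed. Only Turkish is listed, as in the text.
NO_FI_LIGATURE_LANGS = {"tr"}

# U+FB01 LATIN SMALL LIGATURE FI stands in here for the font's ligature glyph.
FI_LIGATURE = "\uFB01"


def shape_run(text: str, lang: str) -> str:
    """Return the run with "fi" ligated, unless the run's language forbids it."""
    # Compare only the primary language subtag, so "tr-TR" is treated as "tr".
    primary = lang.split("-")[0].lower()
    if primary in NO_FI_LIGATURE_LANGS:
        return text  # keep "f" and "i" separate so the dot stays visible
    return text.replace("fi", FI_LIGATURE)


if __name__ == "__main__":
    print(shape_run("final", "en"))   # ligature formed
    print(shape_run("fikir", "tr"))   # ligature suppressed
```

The point of the sketch is that the plain text itself is identical in both
cases; only the language tag on the run changes the shaping decision.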

       Peter


