Re: Latin ligatures and Unicode

From: Tex Texin
Date: Wed Dec 29 1999 - 15:17:34 EST

Interesting comments.
I don't see why adding a character is preferable to adding markup
that operates at a point between characters. It seems to me that if I have
a mechanism such as a markup language, I would want all commands to go
through the markup, rather than having a separate mechanism for commands
that operate at a point rather than on a span.
Having two mechanisms certainly makes design and implementation harder,
since one has to ensure that the two interoperate and interact reasonably.

Certainly, a good tool would make the markup as easy to generate from the
keyboard as a new character would be, so input is not the issue.


Gary Roberts wrote:
> Yes. There is definitely an issue of how to accomplish what one wants in
> a way that will be implemented. For example, if the solution relies on
> language tags (e.g. dictionary based solutions), then it is of little use
> if companies don't provide support for your language. On the other hand,
> the soft hyphen is generally implemented, and supports languages that
> haven't even been invented yet. Now, one could argue whether soft hyphen
> is best implemented as markup or as the addition of a new character. I
> tend to read and create markup files by hand. My tendency is to prefer
> markup when there is some span to the markup. The more characters the
> markup is likely to affect, the more I prefer it to adding a character.
> Soft hyphen is an example where there is no span at all, and it makes
> sense to solve the issue with a soft hyphen character. I see ZWL
> as a substitute for markup having a span of two or three characters, which
> still makes it attractive as a new character solution. It also seems
> more flexible. Say that I often deal with fonts that have only ligature
> pairs. Given the choice of ff i or f fi, I always prefer ff i,
> but my colleague prefers f fi. We both prefer ffi as a single ligature
> if it exists in the font. What markup gives each of us the results we
> prefer? For &=ZWL, the answer is f&fi for me, and ff&i for my colleague.
> Note that ZWNL is not useful for this case. I can speculate about the
> appropriate markup language, but I'd rather hear how others have actually
> solved this problem.
> *
> On Wed, 29 Dec 1999, Asmus Freytag wrote:
> > What is at the heart of this recurring request is that support for many
> > scripts (or older typographies) is incomplete without an *interchangeable*
> > method of indicating the presence or absence of ligatures.
> >
> > Plain text used to be the *only* medium with near universal
> > interchangeability. With the web, this has changed. It is now appropriate
> > to move this discussion to a higher plane and consider the question
> > differently:
> >
> > What is the best way to interchange text containing ligatures on the web?
> >
> > Posing this question allows us to consider the full-featured typographic
> > and aesthetic requirements for ligation - as well as any inherent
> > regularities. Once we have a design in place for interchanging ligatures
> > with marked up text, we can revisit that and see whether replacing markup
> > instructions by character codes gives better results.
> >
> > I feel we have explored the semantic aspects of this long enough to
> > conclude that there is some evidence that a ZWNL is linked slightly more to
> > the underlying semantic content of the text than a ZWL, but that in
> > neither case do we have enough to settle the argument in favor of making
> > them characters today.
> >
> > Both concepts ('ligate here', 'don't ligate here') can in principle be
> > expressed with HTML or XML style markup - I have seen too little discussion
> > of what this markup should be like, and what the consequences are of it
> > being present in the middle of words. Is that something that the HTML/XML
> > community wants to deal with?
> >
> > The next question, assuming that we agree on what ligation commands look
> > like in markup, concerns interchange between parts of a program, e.g. text
> > processor to rendering engine. Is it meaningful to have character codes at
> > that level, or is it more typical that each ligature is its own little
> > style run?
> >
> > The strongest arguments in favor of character codes come from those who
> > have long needed to 'trick' various applications into supporting languages
> > that they were not explicitly designed for. If character codes would result
> > in 'enabling' many of these implementations, by letting the author
> > communicate with the rendering engine, so to speak, that is itself a valid
> > argument to consider. (It would need some actual case studies where this
> > approach is shown to work).
> >
> > Still, even that would need to be contrasted with the cost to applications
> > that do not know about these as characters and end up showing 'boxes'.
> >
> > A./
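
Gary's ff i versus f fi example can be made concrete with a small sketch
of how a renderer might treat ZWL as a hint. This is only an illustration
under stated assumptions, not a description of any existing implementation:
the function name shape, the use of "&" as a visible stand-in for ZWL
(borrowed from Gary's notation), and the selection policy (fewest glyphs
first, then the most ZWL-marked joins honored) are all hypothetical.

```python
from functools import lru_cache

ZWL = "&"  # visible stand-in for the proposed zero-width ligator


def shape(text, ligatures):
    """Split `text` into glyph strings, treating ZWL marks as hints.

    Assumed policy (not taken from any specification): use as few glyphs
    as the font allows, and among equally short results prefer the one
    that joins across the most ZWL-marked boundaries.
    """
    chars = []      # the text with ZWL marks removed
    hinted = set()  # k in hinted: a ZWL sat between chars[k-1] and chars[k]
    for ch in text:
        if ch == ZWL:
            hinted.add(len(chars))
        else:
            chars.append(ch)
    plain = "".join(chars)
    ligatures = frozenset(ligatures)

    @lru_cache(maxsize=None)
    def best(i):
        # Best result for plain[i:] as (glyph_count, -hints_honored, glyphs).
        if i == len(plain):
            return (0, 0, ())
        options = []
        for j in range(i + 1, len(plain) + 1):
            piece = plain[i:j]
            # A single character is always available; longer pieces
            # must exist as ligatures in the font.
            if j - i > 1 and piece not in ligatures:
                continue
            count, neg_hints, rest = best(j)
            honored = sum(1 for k in range(i + 1, j) if k in hinted)
            options.append((count + 1, neg_hints - honored, (piece,) + rest))
        # Remaining ties are broken by tuple ordering: arbitrary but stable.
        return min(options)

    return list(best(0)[2])


pairs_only = {"ff", "fi"}
with_ffi = {"ff", "fi", "ffi"}

print(shape("f&fi", pairs_only))  # ['ff', 'i']
print(shape("ff&i", pairs_only))  # ['f', 'fi']
print(shape("f&fi", with_ffi))    # ['ffi']
print(shape("ff&i", with_ffi))    # ['ffi']
```

With a pairs-only font each author gets the split he marked, and both get
the single ffi ligature when the font has one. A span-based markup would
have to express the same preference by wrapping two of the three letters,
which is the comparison the thread is asking about.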

Spanish Proverb:	Don't speak unless you can improve on the silence.
Tex's Proverb:	Don't email unless you can improve on the screen saver.

Progress Software: The #1 Embedded Database
------------------------------------------------------------
Tex Texin                  Director, International Products
Progress Software Corp.    Voice: +1-781-280-4271
14 Oak Park                Fax:   +1-781-280-4949
Bedford, MA 01730 USA
------------------------------------------------------------
