Re: Latin ligatures and Unicode

From: Kenneth Whistler
Date: Wed Dec 29 1999 - 19:22:06 EST

John Jenkins replied to Marco Cimarosti:

> on 12/29/99 5:11 AM, Marco Cimarosti wrote:
> > I would like to stress one point. If I am not totally wrong, Unicode should
> > be a standard to encode *plain text*.
> > AAT, OpenType, or any other font technology should not be considered as
> > *prerequisites* for displaying Unicode.
> > Or is any particular font technology now *required* by the Unicode standard?
> > Or is it now "non conformant" to use bitmapped fonts?
> >
> AAT, OpenType, or some equivalent technology is and always has been a
> prerequisite for displaying Unicode. The standard has been designed from
> the beginning with the assumption that an intelligent rendering engine is
> available which can implement the character-glyph model in some fashion and
> display N characters using M glyphs with rearrangement and reshaping along
> the way.
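
(As a toy illustration of that N-characters-to-M-glyphs idea -- the
substitution table and glyph names below are made up for the sketch,
not taken from any real font or shaping engine:)

```python
# Hypothetical sketch of many-to-one glyph substitution: a shaping
# pass that maps runs of characters to glyph names, so N characters
# can come out as M glyphs. Table entries are illustrative only.

LIGATURES = {
    ("f", "i"): "fi_ligature",          # two characters -> one glyph
    ("\u0644", "\u0627"): "lam_alef",   # Arabic lam + alef -> one glyph
}

def shape(text: str) -> list[str]:
    """Return a glyph-name list for text, applying pair ligatures."""
    glyphs = []
    i = 0
    while i < len(text):
        pair = tuple(text[i:i + 2])
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            glyphs.append(text[i])  # default: one glyph per character
            i += 1
    return glyphs

print(shape("fin"))  # ['fi_ligature', 'n'] -- 3 characters, 2 glyphs
```

A real engine also does rearrangement and contextual reshaping, which
this pairwise pass deliberately omits.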

I think John's response is a bit overblown. It is true that the designers
of the Unicode Standard have always (meaning since 1988 at least) assumed
the availability of an "intelligent rendering engine" as part of the text
handling model for Unicode. But in so doing they were thinking, from the
outset, about the issues of combining marks, ligatures and conjuncts, bidirectional
text handling, and other complexities inherent to the full scope of written text.
It was obvious from the start that no character-cell terminal with bitmaps
was up to the general task, and that a several-layer abstraction between
characters in a text backing store and dots in a display raster was going
to be necessary to do justice to the general problem of rendering.

BUT... conformance to the Unicode Standard does *not* mean that you have
to implement a rendering engine that can handle Arabic, Khmer, *and*
Mongolian to professional typesetting specifications.

One could be implementing a Braille device driver that uses Unicode 3.0
Braille symbol character codes for transmission, and that does not use *any*
font at all for rendering, for example.
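
(The Braille case is particularly clean, because the Braille Patterns
block at U+2800..U+28FF encodes the raised dots directly in the low
eight bits of the code point -- bit 0 is dot 1, up through bit 7 for
dot 8 -- so a driver can raise the pins with no font at all. A minimal
sketch:)

```python
# Font-free use of Unicode Braille: decode the dot pattern straight
# from the code point in the Braille Patterns block (U+2800..U+28FF),
# where bit 0 = dot 1, ..., bit 7 = dot 8.

def dots(ch: str) -> set[int]:
    """Return the set of raised dot numbers (1-8) for a Braille cell."""
    cp = ord(ch)
    if not 0x2800 <= cp <= 0x28FF:
        raise ValueError("not a Braille pattern character")
    bits = cp - 0x2800
    return {d for d in range(1, 9) if bits & (1 << (d - 1))}

print(dots("\u2803"))  # U+2803 BRAILLE PATTERN DOTS-12 -> {1, 2}
```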

It is also conformant to make use of Unicode chart fonts, with fixed
glyph shapes associated with fixed character codes -- as long as the
process doing so does it intentionally and makes no bogus claims
about correct visual layout of Arabic, for example. The production of
the standard itself makes such *conformant* use of a chart font to
enable the printing of the code charts.

And since no Unicode implementation is forced to interpret all Unicode
characters, it is perfectly possible to constrain one's interpreted
repertoire to some fixed small set that *can* be implemented with a
simple one-to-one character-to-glyph representation model. As long as
the Unicode text content is rendered *legibly*, in accordance with the
intended semantics of the characters, and is not "garbaged" by a
misinterpretation of the intended values of the characters, that would
have to be considered conformant.
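
(A minimal sketch of such a fixed-repertoire implementation -- the
glyph names and the fallback convention here are hypothetical, chosen
just to show the one-to-one model with uninterpreted characters
handled visibly rather than garbled:)

```python
# Sketch of a conformant fixed-repertoire renderer: a one-to-one
# character-to-glyph map over a small interpreted set. Characters
# outside the set get a visible fallback glyph instead of being
# silently misrendered. All names here are illustrative.

GLYPHS = {c: f"glyph_{c}" for c in "ABC 0123456789"}
FALLBACK = ".notdef"  # conventional name for the missing-glyph box

def render(text: str) -> list[str]:
    return [GLYPHS.get(c, FALLBACK) for c in text]

print(render("AB9\u0915"))  # Devanagari KA falls back to .notdef
```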

And finally, there are plenty of "backend" processing implementations
of the Unicode Standard that have no rendering -- and that therefore
do not have to worry about the complexities of visual display.

> Unicode has also made the assumption that out-of-band information is
> required to provide the full range of "proper" display required by users --
> e.g., in Unihan where it's acknowledged that Japanese readers won't want to
> see characters written using Taiwanese glyphs.
> "Plain text" in Unicode means (theoretically) the minimal amount of
> information for legible display.
> In this sense, using bitmapped fonts is conformant if and only if the bitmap
> font technology can implement the character-glyph model and would be better
> off if some kind of outside markup were available to finesse the display and
> provide the not-plain-text information.

I don't think this "if and only if" statement can hold for Unicode
implementations in general. Bitmap fonts would be hard-pressed to
deal with the minimal display requirements for many complex scripts,
but it is not beyond the realm of engineering possibility to keep
extending existing approaches. For complex scripts it just isn't worth
the effort, basically, when better approaches using "smart" outline
fonts exist.

But in any case, the requirements for legible display of a given
piece of well-formed Unicode text vary from script to script -- and
not all require the same level of sophistication that Arabic or
Mongolian do, for example.

And Marco, you can put your mind at ease. You will search long and
hard -- and in vain -- in the Unicode Standard, Version 3.0 for any
formal conformance statement that would require an implementation
to make use of a *particular* font technology -- or indeed, of any
font technology at all -- in order to be conformant to the standard.


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:57 EDT