Re: Chromatic text, ligatures and Fraktur ligatures

From: Michael Everson
Date: Mon Jul 08 2002 - 08:34:34 EDT

At 10:40 +0100 2002-07-08, William Overington wrote:
>Michael Everson wrote as follows.
>>Your courtyard codes and your scientific chromatic explorations are
>>not appropriate uses of the standard. With Quark XPress I can set my
>>fonts to display in HUNDREDS OF THOUSANDS if not MILLIONS of colours,
>.... .
>Courtyard codes and chromatic fonts are, in my opinion, entirely appropriate
>uses of the standard.

You would be wrong.

>Recently I was referred to an ISO document about characters and glyphs,
>ISO/IEC TR 15285. [...] Courtyard codes and codes for chromatic
>fonts, in my opinion, fall within the definition of character in
>Annex B of that document.

Then you have not understood the definition, or you are twisting it
to your own ends. The question is, are you twisting it because you
really just don't get it, or are you doing this deliberately to waste
our time and get some attention? Because it sure looks like one or
the other at this point.

>Courtyard codes also allow the use of millions of colours. There are 18
>codes for changing colour, 16 for specific colours and 2 for colour 98 and
>colour 99 which can be set to any of those millions of colours using other
>courtyard codes.

This "technology" is useless because there are already solutions in
use by REAL applications involving text markup.

>Courtyard codes are, in my opinion, very important for the future of
>broadcasting using the DVB-MHP system. They will enable Unicode text files
>to carry colour and formatting information which can be straightforwardly
>interpreted by a variety of relatively small Java programs from a variety of
>content providers.

There are other methods of carrying colour and formatting information
which are already in use. It is called markup.

>The advantages for the broadcasting of educational multimedia across
>whole continents will be enormous if a consistent set of codes for
>colours and basic formatting is widely used in a consistent manner.

Doh! Look! They've already invented it! IT'S CALLED MARKUP. Woo-hoo!

>Certainly if such a set were provided in plane 0 of regular Unicode
>then that would be magnificent, yet in any case, that takes time and the
>need to gain a consensus as to the use of a particular set of codes is now,
>and courtyard codes are, as far as I am aware, the only set of codes
>available to do the job at the present time.

You've deluded yourself into thinking that this is the way it should
be done. It isn't, and therefore Unicode will never contain such
codes. Get it? You're wasting your time and ours.

> >If you can't support Unicode on older
>>systems then that's because the systems aren't good enough.
>Ah! A digital divide issue.

You've misused the term "digital divide". It does not have to do with
software versioning.

>Windows 95 and Windows 98 systems, which are
>not very old at all, cannot, as far as I am aware, support advanced font
>technology such as OpenType. In addition, these advanced font technologies
>are not part of the international standards and it seems to me that it is a
>good thing for Unicode to provide facilities for advanced font usage, yet
>quite another thing to start cutting off support routes for users of older
>equipment, even when that equipment is only three years old.

Tough. That's the nature of software development. You try to support
older data, but you don't resort to hacks to simulate new
technological abilities in old systems. You take it as read that
people will have to upgrade their software, hardware, memory, or

Advanced font technologies should not be part of international
standards. That isn't what international standards are for. Unicode,
as has been pointed out to you before, isn't an international
standard, although its repertoire and architecture are identical with
those of ISO/IEC 10646.

> >Are PUA hacks to fix that a productive use of energy? One can't support
> >everything in legacy data.
>You appear to be referring to my definition of the golden ligatures

All of your PUA "work", actually, not just that particular one.

>Well, first of all, I feel that the word "hack" is inappropriate.
>The golden ligatures collection is a published list of Private Use
>Area allocations. The documents clearly state what they are and
>what they are not.

It allocates code positions for ligatures when it is the stated
intent of the standard not to do so. And it does so in order to
provide some sort of bogus support for "older systems". I think
"hack" is quite descriptive of what you are trying to achieve via
character encoding as opposed to markup.

>The fact of the matter is that people who vote on these matters, largely
>only having a vote because they are the representatives of large
>corporations, have decided that no more precomposed ligatures will be added
>into Unicode.

Because the ones that are already there are only to support legacy
data, and they are not recommended for use, and should be normalized
to their constituent parts. This was the right decision to take.
Whether representatives of corporations large or small, or
representatives of ISO national bodies, the people who took this
decision did so with a profound understanding of character processing
and encoding requirements. There are right ways and wrong ways of
doing things. We do things the right way. Your PUA "allocations" do
things the wrong way.
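
As a rough illustration of what "normalized to their constituent parts" means in practice, here is a minimal sketch using Python's standard unicodedata module. It assumes that the compatibility normalization form NFKC is the one intended; the message itself does not name a form. The helper name fold_ligatures is made up for this example.

```python
import unicodedata

# The Latin ligatures encoded at U+FB00..U+FB06 for legacy compatibility.
LEGACY_LIGATURES = [chr(cp) for cp in range(0xFB00, 0xFB07)]


def fold_ligatures(text: str) -> str:
    """Apply NFKC (an assumption, see the note above), which replaces
    compatibility characters such as U+FB01 with their constituent letters."""
    return unicodedata.normalize("NFKC", text)


# Show each legacy ligature next to the plain letters it folds to.
for lig in LEGACY_LIGATURES:
    print(f"U+{ord(lig):04X} -> {fold_ligatures(lig)!r}")
```

After folding, the text carries only ordinary letters, and any ligation is left to the font at display time.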

>Since you raise the matter, however, I do feel that adding U+FB07 as
>a ct ligature would be useful

Why? It's unnecessary. It would get normalized to c + t anyway. There
are other ways of doing ligation.

>and, indeed, the golden ligatures collection is designed so that the
>chosen code points dovetail nicely with the code points of the
>U+FB.. block of regular Unicode: the issue seems more one of the
>politics of simply ignoring the
>needs of people who are not using the very latest equipment,

Um, what needs? Which people? Do we see a swarm of people coming to
us complaining that they can't ligate Latin correctly? Do you think
that we would be so stupid as to ignore such an issue if ENCODING the
ligatures were the right way to do it? It is the wrong way to do it,
and that is why we don't do it.

>for I feel that the committees could quite easily include those
>ligatures if they wanted to do so, the amount of additional
>programming for software systems to be able
>to decompose them is perhaps not too great if someone has already programmed
>decomposition of the seven existing precomposed ligatures. It is simply a
>political issue of providing facilities for people who are not using the
>very latest equipment.

No it isn't. Ligation shouldn't be done by character encoding.
Ligation is a presentation effect. Encoding is of basic meaningful

>One of the committees appears to be meeting in August on four days
>in three rooms in two buildings in one week. How long would it
>really take them to decide to add in a few extra ligature characters
>into the U+FB.. block in order to resolve a problem that can only at
>present be solved within the Unicode system using Private Use Area

An eternity, because the committee will not decide to do so because
this is not the correct way to encode text or ligatures.

>I notice that some Fraktur fonts use code points for ligatures which are not
>part of the Unicode system.

Text written using those will have to be normalized and will require
some intelligent ligation in the fonts.

>I would have thought that it would, on balance, be better for the
>committees to review the matter and add the ligatures into the
>U+FB.. block once and for all. Fraktur is not going to go away and
>neither are the ligature decisions involved in typesetting Fraktur
>and neither is historical research into 18th Century English printed
>books and so on.

You would have thought wrong. The committees don't want the ligatures
in the U+FB.. block either; they are only there for legacy reasons.
They are not an invitation to add more, even to complete the set,
because they aren't supposed to be USED.

>On another aspect of this matter of ligatures, I may be wrong and it may
>just be me who feels this way, yet I do feel that there is perhaps a feeling
>that it is nice to be able to set individual electronic units which are in
>one to one correspondence with the metal units which historical printers
>actually picked up out of a type tray and placed in a composing stick.

No, there isn't any such feeling.

>I have recently started making fonts, using the Softy program, (shareware
>which is available on the web) and I do like to have a ct ligature at
>decimal 59143, which is U+E707, as if it were a piece of metal type in some
>sort of virtual olde worlde printshop. Is that just me or does anyone else
>get that feeling too?

It is just you, William. Unicode is about character encoding. This
means you have to understand what characters are. When you do, you'll
know that what you want to do is enable your readers to sort and
search and interchange your documents. If you want special ligatures,
use glyphs and ligature tables in your OpenType fonts. Then you can
get the ligatures you want.
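
The search problem can be sketched as follows, again with Python's standard unicodedata module. The word "action" and the helper names are illustrative assumptions; the only value taken from the thread is decimal 59143 (U+E707). The sketch checks that the code point is private-use and that normalization cannot turn it back into c + t, so a plain search for "ct" misses it.

```python
import unicodedata

CT_PUA = chr(59143)  # the private-use code point quoted in the thread


def is_private_use(ch: str) -> bool:
    """True when the character has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def searchable(text: str) -> str:
    """Fold compatibility characters before searching (NFKC assumed)."""
    return unicodedata.normalize("NFKC", text)


pua_word = "a" + CT_PUA + "ion"  # "action" spelled with a private-use ct ligature
plain_word = "action"            # plain letters; ligation left to the font

print(hex(ord(CT_PUA)))                 # hexadecimal form of 59143
print(is_private_use(CT_PUA))           # is it a private-use character?
print("ct" in searchable(pua_word))     # can a search for "ct" find it?
print("ct" in searchable(plain_word))   # same search on plain letters
```

A private-use character has no decomposition that a receiving system could know about, which is why the plain spelling with font-level ligation remains searchable and the PUA spelling does not.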

To sum up, William, I think you should give up on trying to invent
anything and putting it into the PUA. You do not appear to understand
the character glyph model sufficiently. Perhaps it would be
beneficial to you to learn about OpenType formats and design a nice
font with ligatures in it WITHOUT using any PUA codes. When you have
done that you will understand how you can get what you need.

Michael Everson *** Everson Typography ***

This archive was generated by hypermail 2.1.2 : Mon Jul 08 2002 - 06:58:02 EDT