Re: smart fonts that just work (was Re: Taiwanese: unicode of o with dot

From: John Cowan
Date: Fri Aug 18 2000 - 11:00:02 EDT

Disclaimer: This message is excessively rambling. It is not so much a
rebuttal of Peter's message as a meditation on it.

Peter wrote:

> Plain text is simply limited in what it can represent. Strictly speaking,
> it isn't even necessarily enough to ensure that a human can read the text
> and know what it's supposed to mean.

No format can ensure *that*: I know what "La tika la more na" means, but
none of you do, for sure.

Plain text *is* supposed to assure bare legibility, provided one knows the
language and the writing system: it does not allow reading the Linear A
tablets, because nobody knows how to read them at all.

At this point it may be useful to introduce Douglas Hofstadter's three-layer
model of a message:

        The *outer message* says "This is a message, read me if you can!"
        The *frame message* says "This message is in Japanese using hiragana only."
        The *inner message* tells about Genji and what happened to him.

Every message has all three of these, or it isn't comprehensible. Encoding
the outer message in the inner message is pointless: if you can't read the
outer message, you will never discover that there is an inner message.

Encoding the frame message in the inner message is also pointless in principle
(if you can't read hiragana, you are hosed) but may be useful in limited
circumstances or with naive readers, like XML parsers. XML allows encoding
the character set in the inner message, even though one must know or guess
an approximation to the character set in order to find out what the exact
character set is.
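
To make the chicken-and-egg step concrete, here is a minimal sketch in Python
of the two-stage guess. It is an illustration only, not any real parser's
code: the function name sniff_xml_encoding is made up, it handles only a
UTF-8 byte-order mark, UTF-16 and ASCII-compatible encodings (no UTF-32, no
EBCDIC), and it reads at most the first 200 bytes.

```python
import re

def sniff_xml_encoding(data: bytes) -> str:
    """Hypothetical helper: guess the encoding family, then read the declaration."""
    # Stage 1: approximate the encoding from a byte-order mark or from
    # the byte pattern of the leading "<?".
    if data.startswith(b"\xef\xbb\xbf"):
        family, data = "utf-8", data[3:]
    elif data.startswith(b"\xff\xfe"):
        family, data = "utf-16-le", data[2:]
    elif data.startswith(b"\xfe\xff"):
        family, data = "utf-16-be", data[2:]
    elif data.startswith(b"<\x00?\x00"):
        family = "utf-16-le"
    elif data.startswith(b"\x00<\x00?"):
        family = "utf-16-be"
    else:
        family = "ascii"  # assume some ASCII-compatible encoding

    # Stage 2: the approximation is good enough to decode the declaration,
    # which is restricted to ASCII characters, and read the exact name.
    head = data[:200].decode(family, errors="replace")
    match = re.match(
        r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', head
    )
    if match:
        return match.group(1)
    # No declaration: fall back to UTF-8 for the ASCII-compatible case.
    return "utf-8" if family == "ascii" else family
```

Given a document that begins with an ASCII-readable declaration naming
ISO-8859-1, the function returns that name even though bytes later in the
document are not ASCII; the guess only has to be good enough to read the
first line.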

> [P]lain text can only encode
> what we might call propositional/lexical aspects of meaning. It can't
> record some aspects of meaning, such as emphasis,

Are you *sure* about that? :-)

> beyond the limitations of
> the limited punctuation characters a writing system supports.

Well, fancy text can't do that either.

> And it
> certainly can't record aesthetic elements a typographer might want to add to
> a text, such as discretionary ligatures and swashes.

This I believe is the central point. Given that these facilities don't affect
bare legibility (except to destroy it if there are too many of them!), they
don't belong in plain text as defined above.

> Similarly for Turkish (and note that the
> plain text file won't have ligatures).

Well, it can if it exploits the ligature compatibility characters.
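
For the curious, a small sketch in Python of what "exploiting the ligature
compatibility characters" looks like in practice. It assumes only the Latin
ligature block U+FB00..U+FB06 and the standard unicodedata module; the helper
names is_latin_ligature and expand_ligatures are invented for this example.

```python
import unicodedata

def is_latin_ligature(ch: str) -> bool:
    """True for the Latin ligature compatibility characters U+FB00..U+FB06."""
    return "\ufb00" <= ch <= "\ufb06"

def expand_ligatures(text: str) -> str:
    """Replace each Latin ligature with its compatibility decomposition."""
    return "".join(
        unicodedata.normalize("NFKD", ch) if is_latin_ligature(ch) else ch
        for ch in text
    )

# A plain text string that really does contain a ligature:
word = "\ufb01ne"  # U+FB01 LATIN SMALL LIGATURE FI, then "n", "e"
plain = expand_ligatures(word)  # the same word spelled with separate letters
```

Here word is three characters long and plain is four: the ligature is a
single coded character in the plain text, and compatibility normalization
is what turns it back into "f" followed by "i".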


-- 
Schlingt dreifach einen Kreis um dies! || John Cowan
Schliesst euer Aug vor heiliger Schau, ||
Denn er genoss vom Honig-Tau,          ||
Und trank die Milch vom Paradies.      -- Coleridge (tr. Politzer)
