Re: re:Multi-Lingual Project Gutenberg (was: Unicode plain text)

From: Timothy Partridge
Date: Tue May 27 1997 - 15:18:16 EDT

Pierre Lewis recently said:

> With wavering faith I wrote:
> :-)
> > HTML certainly is an interesting alternative to plain text because it
> > is so universal (and, hopefully, with a stable foundation). And it
> > allows to include illustrations, annotations, &c.
> Coincidentally, I was reading last nite (ironically, in "iX", a German
> magazine) about XML (eXtensible Markup Language) which, says the
> article, could replace (in the mid term) HTML as the lingua franca of
> the Web. So much for that idea...
> Es lebe plain text! (long live ~)

And what about the Standard Generalised Markup Language (SGML)? It has been
around for ages (an ISO standard, ISO 8879, since 1986). It lets you define a
set of markup tags and then use them.
HTML is one particular set of SGML tags, and the SGML definition of HTML (the
DTD) is available from the W3C. If you are writing text in HTML, I would
strongly recommend that you put a DTD version declaration at the top, e.g.

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

which declares HTML 3.2 markup (the EN gives the language of the DTD itself,
not of your text). Then syntax-check the HTML with an SGML parser to make sure
it conforms.
Finally, keep a copy of the DTD somewhere safe, along with a copy of the
matching HTML standard, so that future generations can always understand your
text. (The copy of 3.2 that I have is about 12K in size.)
You might want a copy of the SGML standard too; I don't know where to get a
machine-readable copy.
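
As a rough illustration of the first step only, here is a minimal Python
sketch that checks whether a text starts with a DTD version declaration and
splits its public identifier into fields. The function names and field labels
are my own, purely for illustration. It does not validate the markup against
the DTD; for that you still need a real SGML parser.

```python
import re

# Matches a leading declaration such as
#   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
# Leading whitespace is tolerated; keyword case is ignored.
DOCTYPE_RE = re.compile(
    r'\s*<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]+)"[^>]*>',
    re.IGNORECASE,
)


def public_identifier(html_text):
    """Return the public identifier of a leading DOCTYPE, or None."""
    match = DOCTYPE_RE.match(html_text)
    return match.group(1) if match else None


def split_public_identifier(identifier):
    """Split a public identifier on '//' into four labelled fields.

    The labels are illustrative names, not terms taken from a library.
    Returns None if the identifier does not have exactly four fields.
    """
    parts = identifier.split("//")
    if len(parts) != 4:
        return None
    registration, owner, description, language = parts
    return {
        "registration": registration,  # "-" means an unregistered owner
        "owner": owner,                # e.g. "W3C"
        "description": description,    # e.g. "DTD HTML 3.2"
        "language": language,          # language of the DTD, e.g. "EN"
    }


if __name__ == "__main__":
    sample = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">\n<HTML></HTML>'
    found = public_identifier(sample)
    print(found)
    print(split_public_identifier(found))
```

A text with no declaration at the top makes public_identifier return None,
which is a cheap thing to test for before handing the file to a proper parser.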


Tim Partridge. Any opinions expressed are mine only and not those of my employer
