RE: Unicode plain text

From: Pierre Lewis (lew@nortel.ca)
Date: Sun May 25 1997 - 19:09:00 EDT


In message "RE: Unicode plain text", Murray writes:

> The preformatted plain text works OK as long as you have no plans to
> modify it. If you want to edit it, then you have to worry about
> reflowing the lines ...

Most decent plain-text editors have facilities for that.

> ... But even much older software was adept at formatting text.
> E.g., troff and TeX have been around for years and do beautiful jobs of
> formatting text.

Of course, so does HTML today. But none of that is plain text: troff,
TeX and HTML all require some processing intelligence that may no longer
be around in 30 years, and that may not be available everywhere.

Is there a specification somewhere that tells me how type 1 plain text
(using Tim's terminology again for a moment) will be formatted for
display and printing? Will things such as the following be dealt with
properly?

   This is a recursive bulleted list.

   o Bullet one, a very long line.....
      that folds:
      - subbullet one a, another long line....
        that folds;
      - a second subbullet

   o Bullet two.

Can I rely on this intelligence to always yield something that reflects
my intentions? With recursive bullet lists? With tables? Etc.

Ah, maybe that's what some folks mean when they ask for a standard for
plain text in Unicode?!

Or am I not more likely to see things such as what your email software
did to my original post:

> > o It's the format of all RFCs, perhaps the most widely-read
> > plain-text
> > files around,

The middle line got folded, but the software didn't realize it was a
bulleted list :-)
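
To make the folding problem concrete, here is a minimal sketch in Python.
It is an illustration only, not what any particular mail program does;
the function names and the bullet pattern (a leading "o", "-" or "*") are
assumptions made for this example. The naive version folds a long line
with no idea that it belongs to a list; the second version indents the
continuation so that it hangs under the bullet text.

```python
import re
import textwrap

# Hypothetical bullet pattern: optional indent, then "o", "-" or "*",
# then at least one space. Real-world lists use many other markers.
BULLET = re.compile(r"^(\s*)([o\-*])\s+")


def naive_refold(line: str, width: int) -> list[str]:
    """Fold one line at `width` with no knowledge of list structure."""
    return textwrap.wrap(line, width=width, break_on_hyphens=False)


def bullet_aware_refold(line: str, width: int) -> list[str]:
    """Fold one line, hanging continuation lines under the bullet text."""
    match = BULLET.match(line)
    if match is None:
        return naive_refold(line, width)
    # Continuation lines are indented by the width of "indent + bullet + space".
    hang = " " * len(match.group(0))
    return textwrap.wrap(
        line, width=width, subsequent_indent=hang, break_on_hyphens=False
    )


if __name__ == "__main__":
    line = "o It's the format of all RFCs, perhaps the most widely-read plain-text"
    print("\n".join(naive_refold(line, 60)))
    print()
    print("\n".join(bullet_aware_refold(line, 60)))
```

Even the second version only guesses at structure from the first
characters of a line, which is exactly the kind of undefined intelligence
in question here.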

> Within the Microsoft email system, we use rich text ...

Well, I hope you won't send me any, as I won't know what to do with it.
Is it HTML-like markup? Of course rich text can be nice, but only if
everyone has it. The nice thing about plain text *is* that everyone has
it by default. But I think that applies only to type 2, i.e. plain text
with hard line breaks, in other words preformatted.

The big advantage I see of the type 2 plain text (with hard line
breaks) is that it requires *no* intelligence to render correctly. Well,
Unicode requires BIDI, I guess (and let's hope that won't change in the
next 30 years). But otherwise, just adjust to the line-length convention
(by choosing a decent point size) and you're in business. No reliance on
some software to do some undefined reformatting in the hope that it
won't misrepresent your intentions.

Pierre


