Hannu> ASCII-style plain text files deserve to die.
I don't think you'll get any argument on that here, but the point deserves
to be reinforced. Not only are the printable characters in ASCII grossly
inadequate even for English, but "plain text" files are neither plain nor
text. They are usually manually composed printer (now display) command
strings, built from a seriously inadequate set of formatting commands that,
in addition, is in no way standardized. The biggest hassle today is the
conflicting usage for end-of-line sequences (<CR LF>, <CR> alone, or <LF>
alone) but minor hassles still exist. Should end-of-page markers <FF> be
included? Does backspace mean erase or overstrike? Should line ends be
marked? Can paragraphs be marked?
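
To make the end-of-line hassle concrete, here is a minimal sketch in Python of
what every program that reads "plain text" ends up doing. The function name
normalize_newlines is made up for this illustration and comes from no
particular library; the sketch assumes that <CR LF>, a lone <CR>, and a lone
<LF> each mark exactly one line end.

```python
def normalize_newlines(text: str) -> str:
    """Rewrite every <CR LF> and lone <CR> line end as a single <LF>."""
    # Handle the two-character <CR LF> sequence first; otherwise its <CR>
    # and <LF> would each be counted as a separate line end.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# One string mixing all three conventions.
sample = "first\r\nsecond\rthird\nfourth"
lines = normalize_newlines(sample).split("\n")
print(lines)
```

Note that this settles only the line-end question. It cannot tell whether a
line end was meant as a paragraph break or whether a backspace means erase or
overstrike, because the file does not carry that information.
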
And with all this, we are still stuck with ASCII for many years to come,
for E-mail, HTML, and other purposes. Yes, some of us can use MIME, and
HTML accepts ISO Latin-1 characters, but these are just band-aids.
I have some hope that Unicode plain text will prove useful, but I expect
people to mess with it. If we are going to standardize it, let us not add
anything that will cause these complications. This particularly includes,
in my opinion, language tags. Let plain text be plain. If you are marking
up language IDs, then admit you are doing markup, and either use a standard
markup language consistently, or use some rich text format.
(This extends an argument I made in my report, The Worldwide Impact of the
Unicode Character Set, Character Type, 1994.)
See you at the conference.
Edward Cherlin Cherlin@SnowCrest.net
MIDI, MTG, APL, Unicode, Go 916 938 4684
Everything should be as simple as possible--
_but no simpler_. Albert Einstein