Control characters (was: furigana etc.)

From: John Cowan (cowan@locke.ccil.org)
Date: Tue Jul 04 2000 - 16:11:38 EDT


On Mon, 3 Jul 2000, Edward Cherlin wrote:

> *Some* computer system designers, noticing
> that the demands of printing terminals were not requirements on
> system file internals, chose to use either CR alone or LF alone for
> line or paragraph ends, all without coordination.

IIRC, the Model 37 Teletype interpreted 0A as a newline function,
so ASCII allowed 0A to be interpreted as either LF or NL. (Later,
these functions were assigned to the separate 84 and 85 control
characters, but the 80-9F range never really caught on...)

The Unix folks therefore adopted 0A as the internal end-of-line character,
in conformance with the ASCII rules.

> The use of 1A SUB for end of file in several operating systems
> including PCDOS is a violation of the ASCII standard, which provides
> both 03 ETX (End of Text) and 04 EOT (End of Transmission), but who
> cared?

I think that TOPS-10 was the first OS to use this convention; if it was
not used there, it was certainly present in OS/8. OS/8 did not record the
exact length of a file, but only the number of blocks it contained;
the convention was to fill out the final block with 1A characters,
which were ignored by text processes. The same thing was done with
binary paper tape images, which were the canonical representation on OS/8 of
non-executable object files. Some programs inserted only a single 1A,
which then came to be thought of as an EOF mark. Presumably, since 1A
is Control-Z, there was some vague notion of Z=End.
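To make the two readings of 1A concrete, here is a rough Python sketch
(not anything OS/8 itself ran). The function names and the 16-byte block
size are made up for illustration; real block sizes differed.

```python
SUB = b"\x1a"  # ASCII 1A, i.e. Control-Z


def pad_block(text: bytes, block_size: int = 16) -> bytes:
    """Fill out the final block with 1A bytes.

    block_size is a hypothetical value chosen for the example.
    """
    remainder = len(text) % block_size
    if remainder == 0:
        return text
    return text + SUB * (block_size - remainder)


def strip_sub_padding(data: bytes) -> bytes:
    """Padding reading: ignore the 1A fill at the end of the last block."""
    return data.rstrip(SUB)


def read_until_sub(data: bytes) -> bytes:
    """EOF-mark reading: stop at the first 1A and discard the rest."""
    head, _, _ = data.partition(SUB)
    return head


if __name__ == "__main__":
    padded = pad_block(b"HELLO\r\n")
    print(len(padded))                    # length is a multiple of block_size
    print(strip_sub_padding(padded))      # original text, padding removed
    print(read_until_sub(b"AB\x1aCD"))    # everything before the first 1A
```

The two readings agree on a file that was merely padded, and differ only
when a 1A turns up in the middle of the data.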

DEC OSes notoriously distorted or misused the control characters: for
example, ^U = NAK was used to kill an input line instead of ^X = CAN (Cancel).

-- 
John Cowan                                   cowan@ccil.org
	"You need a change: try Canada"  "You need a change: try China"
		--fortune cookies opened by a couple that I know


