Re: Control characters

From: john (john@nisus.com)
Date: Wed Jul 05 2000 - 14:56:00 EDT

Next message: Mike Newhall: "Writing a Unicode library from scratch vs. Off-the-shelf"
Previous message: Asmus Freytag: "Re: Plane 14 tags and SCSU"
Next in thread: John Cowan: "Re: Control characters"
Maybe reply: John Cowan: "Re: Control characters"
Maybe reply: Edward Cherlin: "Re: Control characters"

> John Cowan wrote:
>> On Mon, 2000 July 3, Edward Cherlin wrote:
>> *Some* computer system designers, noticing
>> that the demands of printing terminals were not requirements on
>> system file internals, chose to use either CR alone or LF alone for
>> line or paragraph ends, all without coordination.

> IIRC, the Model 37 Teletype interpreted 0A as a newline function,

Also Models 33 and 38, which also interpreted 0x0D as carriage return.

> so ASCII allowed 0A to be interpreted as either LF or NL.

That's a non sequitur, but folks are like that.

> (Later, these functions were assigned to the separate 84 and 85
> control characters, but the 80-9F range never really caught on...)

> The Unix folks therefore adopted 0A as the internal end of line character,
> conformantly to ASCII rules.
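
The practical fallout of those uncoordinated choices is that text may arrive
with CR alone, LF alone, or CR LF. A minimal sketch of folding all three into
the Unix convention follows; the function name is made up for illustration and
is not taken from any particular system.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Fold CR LF and bare CR line ends into a single LF (0A)."""
    # Handle the CR LF pair first, so it does not become two line ends
    # when the bare CR replacement runs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```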

>> The use of 1A SUB for end of file in several operating systems
>> including PCDOS is a violation of the ASCII standard, which provides
>> both 03 ETX (End of Text) and 04 EOT (End of Transmission), but who
>> cared?

and file, record, group and unit separators.

> I think that TOPS-10 was the first OS to use this convention; if not
> used there, it was certainly present in OS/8. OS/8 did not record the
> exact length of a file, but only the number of blocks it contained;
> the convention was to fill out the final block with 1A characters,
> which were ignored by text processes. The same thing was done with
> binary paper tape images, which were the canonical representation on
> OS/8 of non-executable object files. Some programs inserted only a
> single 1A, which then came to be thought of as an EOF mark. Presumably,
> since 1A is Control-Z, there was some vague notion of Z=End.
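
The padding convention described above can be sketched roughly as follows.
This is only an illustration: the block size and the function names are
assumptions, not details of OS/8 or any other system.

```python
SUB = b"\x1a"          # 1A, Control-Z
BLOCK_SIZE = 512       # hypothetical block size, chosen only for the example

def pad_to_block(text: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Fill out the final block with 1A characters."""
    remainder = len(text) % block_size
    if remainder == 0:
        return text
    return text + SUB * (block_size - remainder)

def read_text(stored: bytes) -> bytes:
    """Treat the first 1A as an end-of-file mark and drop everything after it."""
    end = stored.find(SUB)
    return stored if end == -1 else stored[:end]
```

Reading stops at the first 1A, so the same reader handles both a fully padded
final block and a file that carries only a single 1A.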

> DEC OSes notoriously distorted or misused the control characters, thus
> ^U = NAK was used to kill an input line instead of ^X = cancel.
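
For reference, the ^X notation maps a letter to its control code by keeping
only the low five bits of the letter's ASCII code. A small sketch (the helper
name is made up for illustration):

```python
def ctrl(letter: str) -> int:
    """Return the ASCII control code produced by Control plus a letter."""
    # Keeping the low five bits maps 'A'..'Z' (41..5A) onto 01..1A.
    return ord(letter.upper()) & 0x1F

NAK = ctrl("U")   # 0x15, negative acknowledge
CAN = ctrl("X")   # 0x18, cancel
SUB = ctrl("Z")   # 0x1A, substitute
```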

This is somewhat understandable. NAK is negative acknowledge, which
generally means that the end sending it is throwing away the last
block, packet, or byte sent to it. Since some of these editing commands
were merely echoed back from the main processor to the comm control
unit through which the terminal was connected, the concepts of source
and destination got blurred. The comm controller would buffer what was
typed until it got a CR (0x0D), so these editing controls were in
effect commands to that comm controller to clear its buffer.
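
A rough sketch of that buffering behavior, purely as an illustration: the
class and method names are invented, and which characters clear the buffer is
a parameter rather than a claim about any specific controller.

```python
CR = 0x0D    # carriage return
NAK = 0x15   # ^U
CAN = 0x18   # ^X

class LineBuffer:
    """Hold typed bytes until CR; a kill character discards the pending line."""

    def __init__(self, kill_chars=(NAK,)):
        self.pending = bytearray()
        self.kill_chars = set(kill_chars)

    def feed(self, byte: int):
        """Accept one byte; return the completed line on CR, else None."""
        if byte == CR:
            line = bytes(self.pending)
            self.pending.clear()
            return line
        if byte in self.kill_chars:
            # The editing control acts as a command to clear the buffer.
            self.pending.clear()
            return None
        self.pending.append(byte)
        return None
```

Feeding the bytes of "bad", then NAK, then "ok", then CR yields the line "ok";
the killed input never leaves the buffer.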


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:05 EDT