Re: [OT] bits and bytes

From: Nelson H. F. Beebe (beebe@math.utah.edu)
Date: Thu May 17 2001 - 18:55:00 EDT


Peter Constable <peter_constable@sil.org> asks on Thu, 17 May 2001
15:39:02 -0500 about historical byte sizes > 8 bits.

I worked on, and co-managed, a DEC TOPS-20 KL-10 system for 12 years,
until its retirement in the Fall of 1990. I recall it with great
fondness, but that is another long off-topic story.

The PDP-10 architecture had 36-bit words, and byte instructions could
work with bytes of any size from 1 bit to 36 bits. Text files were
normally stored as 7-bit ASCII, which packs 5 characters into 35 bits
of a word. The leftover bit (at the bottom end) was normally 0, but
was otherwise ignored by the byte instructions. Some text editors,
however, set it to 1 to indicate that the word contained a 5-digit
decimal line number. The file system contained an attribute that
recorded the byte size.
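
To make the 5-characters-per-word layout concrete, here is a minimal
Python sketch of it. The function names and the NUL padding of short
strings are my own assumptions for illustration; they are not taken
from any TOPS-20 software.

```python
WORD_BITS = 36   # PDP-10 word size
CHAR_BITS = 7    # 7-bit ASCII
CHARS_PER_WORD = 5


def pack_ascii7(text: str, low_bit: bool = False) -> int:
    """Pack up to five 7-bit characters, left-justified, into a 36-bit word.

    low_bit is the leftover bottom bit (assumed here to be the flag that
    some editors set on line-number words).
    """
    if len(text) > CHARS_PER_WORD:
        raise ValueError("at most 5 characters fit in one word")
    codes = [ord(c) for c in text]
    if any(code > 0x7F for code in codes):
        raise ValueError("not 7-bit ASCII")
    codes += [0] * (CHARS_PER_WORD - len(codes))  # pad with NULs (assumption)
    word = 0
    for code in codes:
        word = (word << CHAR_BITS) | code
    return (word << 1) | int(low_bit)  # 35 character bits + 1 leftover bit


def unpack_ascii7(word: int) -> tuple[str, int]:
    """Return the five characters and the leftover bottom bit."""
    chars = [
        (word >> (WORD_BITS - CHAR_BITS * (i + 1))) & 0x7F
        for i in range(CHARS_PER_WORD)
    ]
    return "".join(chr(c) for c in chars), word & 1
```

For example, unpack_ascii7(pack_ascii7("00010", low_bit=True)) gives
back the five digits together with a leftover bit of 1.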

When C became available on TOPS-20, the issue of byte size became
significant. There were two compilers: pcc, done by Jay Lepreau of our
then Computer Science Department and based on Steve Johnson's old
Portable C compiler, and kcc, done by Ken Harrenstein at SRI. pcc
simply treated text files as containing 7-bit characters, and binary
files as sequences of 36-bit words. However, kcc offered extended
datatypes to access 6-bit, 7-bit, 8-bit, 9-bit, and 36-bit characters.
For NFS use with file systems shared with VAX VMS and UNIX, we worked
with 8-bit characters, which put 4 such characters in a word, leaving
the bottom 4 bits unused and set to zero.
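
The 8-bit arrangement can be sketched the same way: four bytes occupy
the top 32 bits of the word and the bottom 4 bits stay zero. This is
only an illustration in Python with made-up function names, and the
zero padding of short input is my assumption; it is not how kcc
itself was written.

```python
def pack_bytes8(data: bytes) -> int:
    """Pack up to four 8-bit bytes, left-justified, into a 36-bit word."""
    if len(data) > 4:
        raise ValueError("at most 4 bytes fit in one word")
    padded = data.ljust(4, b"\0")               # pad short input (assumption)
    return int.from_bytes(padded, "big") << 4   # bottom 4 bits left as zero


def unpack_bytes8(word: int) -> bytes:
    """Recover the four 8-bit bytes, discarding the bottom 4 bits."""
    return ((word >> 4) & 0xFFFFFFFF).to_bytes(4, "big")
```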

Honeywell and Univac systems of the 1970s and 1980s also had 36-bit
words. Univac supported 9-bit and 18-bit bytes with quarter-word and
half-word instructions, but I don't know whether the half-word chunks
were ever called bytes. We had such a machine on this campus, but I
only experienced it second hand, because without network support, it
held little interest for me. Univac text files were normally stored
as 9-bit bytes, with the high-order bit zero.

Prime systems of that era had a 32-bit word and 8-bit characters, but
put ASCII in 128..255, with the high bit always on.
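
Converting between that convention and ordinary ASCII is just a
matter of setting or clearing the top bit of each byte. A small
Python sketch, with function names of my own invention:

```python
def to_prime(text: str) -> bytes:
    """Encode ASCII text with the high bit of every byte turned on."""
    return bytes(b | 0x80 for b in text.encode("ascii"))


def from_prime(data: bytes) -> str:
    """Clear the high bit of every byte and decode as ordinary ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")
```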

Life is assuredly better today, now that word sizes, other than on
some embedded processors, are uniformly multiples of 8 bits, and
characters are numbered starting from 0.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- Center for Scientific Computing FAX: +1 801 585 1640, +1 801 581 4148 -
- University of Utah Internet e-mail: beebe@math.utah.edu -
- Department of Mathematics, 322 INSCC beebe@acm.org beebe@computer.org -
- 155 S 1400 E RM 233 beebe@ieee.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe -
-------------------------------------------------------------------------------
