Re: Backslash n [OT] was Line Separator and Paragraph Separator

From: John Cowan (cowan@mercury.ccil.org)
Date: Tue Oct 21 2003 - 06:19:25 CST


Jill Ramonsky scripsit:

> This is axiomatically *THE* definition. Period. Everything else is
> merely quoting, rephrasing or reinterpreting this original.

Absolutely not. The *standard* for the C programming language is now
ISO/IEC 9899. The 2nd edition of K & R, much-beloved as it is, is just
two guys' interpretation of that standard, as the book itself makes clear.
What they say possesses a peculiar interest, but not a peculiar authority.

The standard itself is not on line, but the Rationale, which was
written by the same working group at the same time, is on line at
std.dkuug.dk/JTC1/SC22/WG14/www/docs/n850.ps . It makes quite clear that
*any* character set that contains the necessary characters is appropriate
for C:

# There was strong sentiment that C should not be tied to ASCII, despite
# its heritage and despite the precedent of Ada being defined in terms
# of ASCII. Rather, an implementation is required to provide a unique
# character code for each of the printable graphics used by C, and for
# each of the control codes representable by an escape sequence. [...]
# Translation and execution environments may have different character sets,
# but each must meet this requirement in its own way.
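To make that requirement concrete, here is a minimal sketch in C. It assumes nothing about ASCII. It only checks that the letter escape sequences get distinct codes on whatever implementation compiles it, and prints the codes that implementation chose. The names escape_codes_are_distinct and print_escape_codes are mine, purely for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* The control characters that C gives a letter escape sequence. */
static const char escape_chars[] = { '\a', '\b', '\f', '\n', '\r', '\t', '\v' };
static const char *const escape_names[] = {
    "\\a", "\\b", "\\f", "\\n", "\\r", "\\t", "\\v"
};
enum { ESCAPE_COUNT = sizeof escape_chars / sizeof escape_chars[0] };

/* Returns 1 if every escape sequence has its own code on this implementation. */
int escape_codes_are_distinct(void)
{
    int i, j;
    for (i = 0; i < ESCAPE_COUNT; i++)
        for (j = i + 1; j < ESCAPE_COUNT; j++)
            if (escape_chars[i] == escape_chars[j])
                return 0;
    return 1;
}

/* Prints the numeric code this implementation assigns to each escape.
   The values are implementation-defined; nothing here assumes '\n' == 10. */
void print_escape_codes(FILE *out)
{
    int i;
    assert(out != NULL);
    for (i = 0; i < ESCAPE_COUNT; i++)
        fprintf(out, "%s = %d\n", escape_names[i],
                (int)(unsigned char)escape_chars[i]);
}
```

Calling print_escape_codes(stdout) on an ASCII-based system and on an EBCDIC-based one would be expected to show different numbers, while escape_codes_are_distinct() should return 1 on both.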

In addition, the Rationale makes clear that internal newlines can be
mapped to anything appropriate on output, including CR/LF and padding
with blank spaces to fit into a card reader/punch environment:

# In the UNIX model, division of a file into lines is effected by newline
# characters. Different techniques are used by other systems: lines may
# be separated by CR-LF (carriage return, line feed) or by unrecorded
# areas on the recording medium; or each line may be prefixed by its
# length. The Standard addresses this diversity by specifying that newline
# be used as a line separator at the program level, but then permitting an
# implementation to transform the data read or written to conform to the
# conventions of the environment. Some environments represent text lines as
# blank-filled fixed-length records. Thus the Standard specifies that it is
# implementation-defined whether trailing blanks are removed from a line on
# input. (This specification also addresses the problems of environments
# which represent text as variable-length records, but do not allow a
# record length of 0: an empty line may be written as a one-character
# record containing a blank, and the blank is stripped on input.)
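A rough sketch of what that looks like from the program's side, assuming a hosted implementation with a writable current directory. The helper names text_roundtrip_ok and stored_size and the file name in the usage note are hypothetical. The program writes and reads '\n' through text streams, and only a binary-mode read shows what the environment actually stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `text` through a text-mode stream, then read it back through a
   text-mode stream. Returns 1 if the program sees the same characters. */
int text_roundtrip_ok(const char *path, const char *text)
{
    char buf[256];
    size_t n;
    FILE *f;

    assert(strlen(text) < sizeof buf);

    f = fopen(path, "w");            /* text mode: '\n' may be transformed */
    if (f == NULL)
        return 0;
    fputs(text, f);
    if (fclose(f) != 0)
        return 0;

    f = fopen(path, "r");            /* text mode: transformation is undone */
    if (f == NULL)
        return 0;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return strcmp(buf, text) == 0;
}

/* Count the bytes as stored, bypassing text-mode translation.
   Returns -1 if the file cannot be opened. */
long stored_size(const char *path)
{
    long count = 0;
    FILE *f = fopen(path, "rb");     /* binary mode: no translation */
    if (f == NULL)
        return -1;
    while (getc(f) != EOF)
        count++;
    fclose(f);
    return count;
}
```

For example, text_roundtrip_ok("newline_demo.txt", "a\nb\n") should return 1 everywhere, since the text has no trailing blanks and ends in a newline. The value of stored_size("newline_demo.txt") afterwards depends on the environment: I would expect 4 on a UNIX-like system and 6 where lines end in CR-LF.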

Anyone have the standard handy to quote chapter and verse?

-- 
Híggledy-pìggledy / XML programmers            John Cowan
Try to escape those / I-eighteen-N woes;        http://www.ccil.org/~cowan
Incontrovertibly / What we need more of is      http://www.reutershealth.com
Unicode weenies and / François Yergeaus.        jcowan@reutershealth.com

