Re: Backslash n [OT] was Line Separator and Paragraph Separator

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Oct 24 2003 - 06:32:53 CST


From: "John Cowan" <cowan@mercury.ccil.org>

> > Still, I stand by saying that \n is defined in C++ as LF and \r as CR,
> > because that's sitting in front of me in black and white.
>
> Yes, true. But that does *not* mean that (int)'\n' can be counted on to
> be 10, any more than (int)'a' can be counted on to be 65. \n corresponds
> to the LF in the source character set, whatever that is. On Mac Classic,
> \n is 15, and on EBCDIC systems, it's also 15, though for a different
> reason.

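Correction: on Mac Classic, \n is 015 in octal (13 in decimal), and in
EBCDIC it is 15 in hexadecimal (21 in decimal), not simply "15": please
don't mix decimal, octal and hexadecimal notations in the same sentence
without saying which is which.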
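
To keep the notations apart, a small sketch like the following formats one
character code in all three bases at once. The helper name format_code is
hypothetical (it is not part of any library), and the value you get for '\n'
depends on the execution character set of your compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the code c into buf as
   "decimal = octal = hexadecimal", e.g. "13 = 015 = 0x0D" for 13,
   so that the three notations cannot be confused.
   Returns what snprintf returns (the length of the formatted text). */
int format_code(int c, char *buf, size_t size)
{
    assert(buf != NULL && c >= 0);
    /* %#o prefixes a leading 0 (octal), 0x%02X prints at least two hex digits */
    return snprintf(buf, size, "%d = %#o = 0x%02X",
                    c, (unsigned)c, (unsigned)c);
}
```

Calling format_code('\n', buf, sizeof buf) on an ASCII-based system gives the
codes for 10; on other systems the result may differ, which is the point of
this thread.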

The situation is, however, more complex: '\n' is bound to LF only if
the character set contains this character and traditionally uses it.

In the C compilers for IBM MVS that I have used, the '\n' source constant
is bound to the EBCDIC NEL character, not to LF, simply because NEL is the
normal character used to terminate a line in a text file...

It is true, however, that C compilers for MVS will accept source files
whose lines are terminated by either LF or NEL. This does not
mean that a source line like printf("Hello, world!\n"); will generate an
LF character in the output stream: the string will contain a '\n' character
bound to NEL, which may be converted at run-time into an LF character
depending on the properties that the environment sets on the output stream
for text files.
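
As a rough illustration of such a run-time converter, here is a minimal
sketch, assuming the external line terminator is a single byte chosen per
stream. The function name convert_newlines and the two constants are
hypothetical; the byte values are the usual EBCDIC LF (0x25) and ASCII RS
(0x1E) codes. A real converter would also translate the rest of the text
between character sets, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

#define EBCDIC_LF 0x25  /* LF in the usual EBCDIC code pages */
#define ASCII_RS  0x1E  /* ASCII Record Separator */

/* Hypothetical output converter: copies len bytes from src to dst and
   replaces every internal '\n' (whatever its numeric value is for this
   compiler) with the external terminator selected for the stream.
   dst must have room for len bytes. Returns the number of bytes written. */
size_t convert_newlines(const char *src, size_t len,
                        unsigned char *dst, unsigned char external_eol)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < len; i++)
        dst[i] = (src[i] == '\n') ? external_eol : (unsigned char)src[i];
    return len;
}
```

The program only ever sees '\n'; which byte reaches the outside world is
decided by the external_eol argument, that is, by the stream's properties
and not by the language.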

In systems like MVS or VMS, where the properties of I/O resources can be
set in the environment before the program runs, the effect of these
converters is not part of the language itself, but of the filesystem and
I/O subsystems, which include these code converters. This is even more
important in record-based filesystems, where each text line is stored in
a separate record written by an indivisible I/O operation, and where
unrestricted streams of bytes are not supported (for example on card
punches).
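
To make the record-based case concrete, here is a minimal sketch, assuming
fixed-length records of 80 bytes padded with spaces (as on an 80-column
card). The names RECORD_LEN and text_to_records are hypothetical, and real
record formats differ in many details; the point is only that the '\n' of
the program is never stored at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECORD_LEN 80  /* assumed record size, as on an 80-column card */

/* Hypothetical helper: stores each '\n'-terminated line of text as one
   space-padded record of RECORD_LEN bytes. The '\n' itself is not stored;
   the record boundary plays its role. Lines longer than a record are
   truncated in this sketch. Returns the number of records written. */
size_t text_to_records(const char *text, char records[][RECORD_LEN],
                       size_t max_records)
{
    size_t n = 0;

    assert(text != NULL && records != NULL);
    while (*text != '\0' && n < max_records) {
        size_t line_len = strcspn(text, "\n");  /* length up to '\n' or end */
        size_t copy_len = line_len < RECORD_LEN ? line_len : RECORD_LEN;

        memset(records[n], ' ', RECORD_LEN);    /* pad the whole record */
        memcpy(records[n], text, copy_len);
        n++;

        text += line_len;
        if (*text == '\n')
            text++;                             /* skip the line terminator */
    }
    return n;
}
```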

In CP/M and MS-DOS, a few system calls also include such converters
(in the console I/O functions, but not in the general file I/O
functions). The run-time library linked to the program (this also applies
to Windows) additionally supports the "t" flag in the fopen() standard
I/O API (using FILE*), which enables the converter between an external
physical CRLF code sequence and an internal LF=='\n' char as seen by
the application. Even so, the effect of this converter is not part of
the language itself, but of its optional library.
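
The library converter can be observed with a sketch like the following,
which writes a line in text mode and then counts the bytes really stored by
reading the file back in binary mode. The helper name stored_size_of_line is
hypothetical; it uses only the standard "w" and "rb" modes rather than the
"t" extension, and it assumes the given path can be created and removed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the two characters "a\n" to path in text
   mode, reopens the file in binary mode and counts the bytes actually
   stored, then removes the file. Returns the count, or -1 on error. */
long stored_size_of_line(const char *path)
{
    FILE *f;
    long count = 0;

    assert(path != NULL);

    f = fopen(path, "w");        /* text mode: the library may translate '\n' */
    if (f == NULL)
        return -1;
    if (fputs("a\n", f) == EOF) {
        fclose(f);
        remove(path);
        return -1;
    }
    if (fclose(f) != 0) {
        remove(path);
        return -1;
    }

    f = fopen(path, "rb");       /* binary mode: bytes are read untranslated */
    if (f == NULL) {
        remove(path);
        return -1;
    }
    while (fgetc(f) != EOF)
        count++;
    fclose(f);
    remove(path);
    return count;
}
```

With a run-time library that performs the CRLF conversion the result should
be 3 (a, CR, LF); on systems whose text files use a single LF it should be 2.
The source line is the same in both cases.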

If you look even further, you may find some record-based files that use
the ASCII RS (Record Separator) as the only correct end-of-line marker,
which should be the one that gets generated from a source '\n' constant
in the application.



This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST