Re: Line Separator and Paragraph Separator

From: John Cowan (cowan@mercury.ccil.org)
Date: Mon Oct 20 2003 - 11:32:04 CST


Jill Ramonsky scripsit:

> Are the LS and PS characters actually used in real plain-text documents?

You can find such documents, but they're not common. LS was an attempt
to unify the diverse standards for line-end characters by providing a
new one, but IMHO it flopped. (XML 1.1, however, will interpret LS
as a line-end character.)

> These languages have the convention that "\n" in a string literal
> means "new line". Strictly speaking, BY DEFINITION (from the C and C++
> specs), "\n" is supposed to mean LF, and nothing else,

It means any one character that serves a new-linish function, which can
be LF or CR or NEL, for example. On EBCDIC-based systems, the native
C compiler interprets \n as 0x15, the EBCDIC NL control, which
corresponds to NEL.
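
A quick way to see which value a particular compiler picks is to print
the character's numeric code. This is only a minimal sketch; the function
names newline_code and print_newline_code are made up for illustration,
and the result depends entirely on the implementation's execution
character set (an ASCII-based compiler would be expected to report 0x0A).

```c
#include <stdio.h>

/* Returns the execution-character-set code this compiler assigns to '\n'. */
unsigned newline_code(void)
{
    return (unsigned)(unsigned char)'\n';
}

/* Prints that code in hex so different platforms can be compared. */
void print_newline_code(void)
{
    printf("'\\n' is 0x%02X in this implementation\n", newline_code());
}
```

Calling print_newline_code() from main on each platform of interest shows
the choice directly, without relying on what the documentation says.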

> compiled on Windows will reinterpret "\n" in a string literal to mean
> either LF only (when in memory) or CRLF (when encoded to or from a file
> or stream opened in text mode).

It's any LF character that gets that treatment, of course, not just one
from a string literal. The fact that DOSish systems map LF to CRLF on
output and back on input has nothing to do with the C \n character.
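
The point can be checked without using a \n literal at all. The sketch
below assumes an ASCII-based system, where LF is the number 10; the
helper name bytes_written and any file name passed to it are
hypothetical. It writes 'a', LF, 'b' through a stream opened in the
given mode and then counts the bytes actually stored by reading the file
back in binary mode.

```c
#include <stdio.h>

/* Writes 'a', LF, 'b' to `path` using `mode` ("w" = text, "wb" = binary),
   then returns the number of bytes stored in the file, or -1 on error. */
long bytes_written(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    if (f == NULL)
        return -1;
    fputc('a', f);
    fputc(10, f);            /* LF given as a number, not as a '\n' literal */
    fputc('b', f);
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");   /* binary mode: no translation on input */
    if (f == NULL)
        return -1;
    long n = 0;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}
```

With "wb" the count is 3 everywhere. With "w" it should stay 3 on
Unix-like systems and come out as 4 on DOSish ones, because the text-mode
stream expands the LF to CRLF regardless of where the LF came from.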

> I suspect (but I don't know for sure) that the Mac
> will interpret "\n" as CR only.

Yes.

> It would seem impossible (or at least, a violation of the C/C++ specs)
> to reinterpret "\n" as LS in C/C++ ... but then again, that
> specification has already been violated, so maybe the precedent is there
> and that no longer matters.

It is not a violation.

-- 
Real FORTRAN programmers can program FORTRAN    John Cowan
in any language.  --Allen Brown                 jcowan@reutershealth.com