RE: Backslash n [OT] was Line Separator and Paragraph Separator

From: Jill Ramonsky (Jill.Ramonsky@aculab.com)
Date: Thu Oct 23 2003 - 10:15:55 CST


Are we completely sure about this? I mean - maybe the confusion is about
what constitutes a "text file", not about what constitutes a line break.

I would argue that a complete, valid text file must contain an
integral number of lines. However, were I to take a text file and split
it into ten equal-sized fragments, most of the fragments would very
likely contain a fragment of a line at the start and/or end. It is
even possible that the file fragments may contain an isolated CR at the
end, or an isolated LF at the start, which, when concatenated, would
rebuild a valid CRLF sequence. But I would question whether it was right
or proper to refer to these fragments as "text files". I would argue
that you probably shouldn't, any more than you would count a fragment of
an HTML file as an HTML file.
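
To make the fragment case concrete, here is a minimal sketch in Python.
It assumes the file is held as bytes with CRLF line breaks; the helper
name split_equal and the sample data are hypothetical, chosen only so
that a cut lands between a CR and its LF.

```python
def split_equal(data: bytes, parts: int) -> list[bytes]:
    """Cut data into fragments of equal size (the last may be shorter),
    paying no attention to where the line breaks fall."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


# Hypothetical sample: three CRLF-terminated lines, 17 bytes in total.
text = b"one\r\ntwo\r\nthree\r\n"
fragments = split_equal(text, 2)

# The cut falls inside the second CRLF pair:
#   fragments[0] == b"one\r\ntwo\r"   (isolated CR at the end)
#   fragments[1] == b"\nthree\r\n"    (isolated LF at the start)

# Concatenating the fragments rebuilds the valid CRLF sequence.
rebuilt = b"".join(fragments)

# bytes.splitlines() accepts CR, LF and CRLF as line breaks, so counting
# lines fragment by fragment gives a different total than counting them
# in the whole file.
lines_in_fragments = sum(len(f.splitlines()) for f in fragments)
lines_in_whole = len(text.splitlines())
```

Read in isolation, the two fragments appear to hold four lines between
them, while the rebuilt file holds three. That is a fair sign that the
fragments should not be treated as text files in their own right.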

It isn't clear to me what it means to say "the first line in a text file
may well be partial". If by that you mean that the file is a fragment of
a larger file, and won't make sense until re-concatenated, then fair
enough, but a file in isolation? What can it mean? And what would (or
should) happen if you concatenate a file containing a whole number of
properly terminated lines with a second file in which the first line is
"partial" (assuming we know what that means)?

I'd seriously suggest that we call it a "text file" if and only if it
contains a whole number of lines, in which case there must be (either
explicitly or implicitly) a line-break at the end of the file. (For
example, it really irks me when a C++ file which compiles perfectly well
on Windows fails to compile on Linux just because the last line doesn't
end in a line-break.) I'd further suggest that any file containing
partial lines at the start or end should be called something else, like
a "text file fragment", or a "partial text file". Come to that, does it
even need a name at all? We don't have a name for a "partial Word
document", do we?
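
As a rough illustration of the proposed definition, the following Python
sketch tests whether a byte string ends with an explicit line break. It
is only a sketch under stated assumptions: the function names are
hypothetical, only CR, LF and CRLF are treated as line breaks (not
U+2028, U+2029 or NEL), and the "implicit" final line break is modelled
by a separate helper that appends one.

```python
# Assumption: only these three byte sequences count as line breaks.
LINE_BREAKS = (b"\r\n", b"\n", b"\r")


def is_whole_text_file(data: bytes) -> bool:
    """Return True if data ends with an explicit line break.

    An empty file holds zero lines, which is still a whole number.
    A trailing lone CR is accepted here, although it could equally be
    the first half of a CRLF that was cut in two.
    """
    return data == b"" or data.endswith(LINE_BREAKS)


def with_final_line_break(data: bytes, line_break: bytes = b"\n") -> bytes:
    """Make an implicit final line break explicit by appending one."""
    return data if is_whole_text_file(data) else data + line_break


terminated = b"int main() { return 0; }\n"
unterminated = b"int main() { return 0; }"
```

Under this test, terminated qualifies as a text file and unterminated
does not, although with_final_line_break(unterminated) does. Note that
the test says nothing about a partial line at the start of a file,
which cannot in general be detected from the bytes alone.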

What is this desire for concatenation anyway? I don't go around
concatenating Word documents, HTML files, or XML files. I don't go around
concatenating GIF images or JPEGs. I don't go around concatenating Adobe
PDF documents, nor do I go around concatenating MP3 files or Excel
spreadsheets. Why then should I worry about what happens when I
concatenate text files (unless I know in advance that they're just
fragments)?

Jill

> -----Original Message-----
> From: Kent Karlsson [mailto:kentk@cs.chalmers.se]
> Sent: Wednesday, October 22, 2003 4:44 PM
> To: 'Peter Kirk'; unicode@unicode.org
> Subject: RE: Backslash n [OT] was Line Separator and
> Paragraph Separator
>
> The first and last lines in a text file may well be partial.


