From: Philippe Verdy (email@example.com)
Date: Fri Oct 24 2003 - 17:58:47 CST
From: "Doug Ewell" <firstname.lastname@example.org>
> Jill Ramonsky <Jill dot Ramonsky at Aculab dot com> wrote:
> > Here's a better idea.
> > Let's just stick with the idea that ANY C0 or C1 control has no place
> > being anywhere in a line of text, and so any sequence of one or more
> > them will be interpretted as a line-break!
And <escape>? (Think about the ANSI coloring sequences generated by
your colored version of "ls" or "man" on Linux, or about ISO 2022 charsets.)
And <bell>?
And <so>, <si>, <dle>? (Think about ISO 646 extension mechanisms,
or about SJIS.)
And <us>? (Think about tabular text data in record sets: is a data-cell
delimiter in a text data file a line break?)
Quite a lot of encoding rules use controls that do not (and must not)
imply a line break. An application may need to convert these sequences
through internal Unicode parsing and generation, even if the resulting
string is later downcast to a 7-bit or 8-bit subset, or it may need to
insert non-textual sequences within Unicode strings (for example in
attributed text).
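
To make the objection concrete, here is a minimal Python sketch, not taken
from any real library. It compares the proposed rule (any run of C0 or C1
controls is a line break) with a splitter that only breaks on a small set of
controls. The function names and the choice of CR LF, LF, CR and NEL as the
only line breaks are my assumptions for illustration.

```python
import re

# Assumption for this sketch: only CR LF, LF, CR and NEL (U+0085) break lines.
LINE_BREAKS = re.compile(r"\r\n|[\n\r\x85]")

# The rule proposed in the quoted message: any run of C0/C1 controls breaks.
ANY_CONTROL_RUN = re.compile(r"[\x00-\x1f\x80-\x9f]+")


def split_lines(text):
    """Split on real line-break controls; ESC, BEL, SO, SI, DLE, US survive."""
    return LINE_BREAKS.split(text)


def split_on_any_control(text):
    """Split on every run of C0 or C1 controls (hypothetical proposed rule)."""
    return ANY_CONTROL_RUN.split(text)


# One line holding an ANSI color sequence and a <us> cell delimiter,
# followed by a genuine second line.
sample = "\x1b[31mred\x1b[0m\x1fcell two\nsecond line"

print(split_lines(sample))           # 2 lines, controls kept inside line 1
print(split_on_any_control(sample))  # 5 fragments, escape sequences torn apart
```

With the proposed rule the color sequence is cut into pieces and the <us>
delimiter becomes indistinguishable from the end of the record.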
I also think it would be excessive to handle all C0 and C1 characters