Re: Line Separator character

From: Martin J. Duerst (mduerst@ifi.unizh.ch)
Date: Fri May 16 1997 - 16:38:01 EDT


On Wed, 14 May 1997, Adrian Havill wrote:

> Martin J. Duerst wrote:
> > Email has very strict restrictions on this. You can't send doublebyte
> > UTF-16 or UCS-2 in Email. CRLF always has to be present as a line
> > separator. Unicode in Email is possible with UTF-7 (and CRLF as line
> > separator) or UTF-8 + BASE64/QuotedPrintable (and CRLF...).
> > Please see RFC 2045/6/7 for this.
>
> I'm aware of this. Allow me to clarify: encode the Unicode line and
> paragraph separators in UTF-7 and transmit no CR and LFs. Some
> protocols, such as SMTP, have a line limit (998 octets in the case of
> SMTP).

SMTP email requires line breaks to be encoded as CRLF in everything
that is text (i.e. Content-Type: text/*). The user (or the user
agent) is also asked to keep lines short, to something like
80 characters (actually 80 bytes, since the count is in octets).
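
To make the UTF-8 + BASE64 route concrete, here is a minimal sketch.
The function names are made up for illustration, and the 76-character
width is the limit RFC 2045 puts on BASE64-encoded lines. It shows
only the transfer encoding; headers and the rest of MIME are left out.

```python
import base64

def encode_body(text: str, width: int = 76) -> bytes:
    """Encode text as UTF-8, then BASE64, as short CRLF-terminated lines.

    Hypothetical helper: 'width' defaults to the RFC 2045 limit for
    BASE64-encoded lines.
    """
    raw = text.encode("utf-8")
    b64 = base64.b64encode(raw).decode("ascii")
    # Cut the BASE64 string into pieces of at most 'width' characters.
    pieces = [b64[i:i + width] for i in range(0, len(b64), width)]
    # Every line on the wire ends in CRLF, never a bare CR or LF.
    return "".join(piece + "\r\n" for piece in pieces).encode("ascii")

def decode_body(body: bytes) -> str:
    """Reverse encode_body; b64decode skips the CRLF pairs by default."""
    return base64.b64decode(body).decode("utf-8")

if __name__ == "__main__":
    wire = encode_body("Grüezi aus Zürich\r\n" * 10)
    print(wire.decode("ascii"), end="")
```

What goes over the wire is 7-bit only, with a CRLF at short intervals,
whatever characters the original text contained.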

> However, as the behavior of CR and LF is system dependent, an e-mail
> client could theoretically ignore CR LF, etc and go by the UTF-7 encoded
> Unicode line and paragraph breaks, when

How CR and LF behave is system dependent, but on the wire mail always
uses CRLF, and mail user agents convert to and from the local convention.
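
As a rough sketch of that conversion (the names to_wire and from_wire
are hypothetical, and a real user agent has more cases to handle):

```python
import re

# Any of the common local conventions: CRLF, bare CR, or bare LF.
# CRLF is listed first so it is matched as one break, not two.
_ANY_BREAK = re.compile(r"\r\n|\r|\n")

def to_wire(text: str) -> str:
    """Turn local line breaks into the CRLF that mail requires."""
    return _ANY_BREAK.sub("\r\n", text)

def from_wire(text: str, local: str = "\n") -> str:
    """Turn CRLF from a received message into the local convention.

    'local' is an assumption of this sketch; it defaults to LF.
    """
    return text.replace("\r\n", local)

if __name__ == "__main__":
    print(repr(to_wire("one\ntwo\rthree\r\nfour")))
```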

> RFC2046 says '[i]t should not be necessary to add any line breaks to
> display "text/plain" correctly....'

That's because text/plain (and all of text/*) is already defined
to have line breaks, encoded as CRLF, at 'short' intervals.

> So why not NOT use them and go with
> the Unicode ones?

Because that may (or actually will) break some mail software.
I know many people don't like that (I don't either), but some
things in Internet mail are braindead, and will stay braindead.
Too many influential people are too used to the way things are,
and too many people are afraid of some software failing to work.

Of course, what you can do is to have your local user agent
change from CRLF to whatever line breaking convention you
use locally, which might very well be the "true" Unicode codes.
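
A minimal sketch of such a local mapping, assuming the user agent
stores text with U+2028 LINE SEPARATOR. The function names and the
one-to-one mapping are my assumptions, and U+2029 PARAGRAPH SEPARATOR
is deliberately left alone, since mail has no agreed equivalent for it.

```python
LS = "\u2028"  # Unicode LINE SEPARATOR

def wire_to_local(text: str) -> str:
    """Replace each CRLF of a received message with LINE SEPARATOR."""
    return text.replace("\r\n", LS)

def local_to_wire(text: str) -> str:
    """Replace each LINE SEPARATOR with CRLF before sending."""
    return text.replace(LS, "\r\n")

if __name__ == "__main__":
    received = "first line\r\nsecond line\r\n"
    local = wire_to_local(received)
    # The mail on the wire stays CRLF; only the local copy changes.
    print(local_to_wire(local) == received)
```

The software between the two clients only ever sees CRLF this way.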

> As there are few legacy Unicode-capable e-mail clients, is it not
> possible to push to get this functionality added now?

The problem is not the clients. The problem is all the software
that the mail passes through on its way from one client to the other.

Regards, Martin.


