Re: 8859-1, 8859-15, 1252 and Euro (bis)

From: A. Vine (
Date: Thu Feb 10 2000 - 15:59:30 EST wrote:
> Alain LaBonté wrote:
> >Example of horror stories:
> >>To: =?Windows-1252?Q?Alain_LaBont=E9?= <>
> >>Subject: =?Windows-1252?Q?Papier_=E0_lettre?=
> A friend of mine is experiencing a similar nightmare. Any clue as to why the
> Subject and other fields are "corrupted" this way?

Not corrupted. They conform to RFC 2047, with an MS twist. Note that even when
you request 8-bit MIME, headers must remain 7-bit, so header text in a charset
that uses all 8 bits must be encoded as either quoted-printable (QP) or base64.
The charset must be stated, otherwise the email client can't be certain what
charset the data is in, and the encoding must be stated since it can be either
QP or base64. Hence, for the Subject line above:

    =?Windows-1252?Q?Papier_=E0_lettre?=
    a b           cdce     fg  fe     h

a - start of "encoded word"
b - charset
c - delimiter between encoded word elements
d - Q for quoted-printable or B for base64 (case-insensitive)
e - restricted ASCII in QP
f - underscore, which stands for a space in a QP encoded word
g - a-with-grave-accent in QP form
h - end of "encoded word"
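
For anyone who wants to check this by hand, here is a minimal Python sketch of
a decoder for a single encoded word, following the a-h breakdown above. The
function name decode_encoded_word and the regular expression are my own
illustration, not taken from any particular mailer, and the sketch ignores
details such as folding and multiple adjacent encoded words.

```python
import base64
import re

# Hypothetical helper: matches one whole encoded word,
# =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([QqBb])\?([^?]*)\?=")


def decode_encoded_word(word):
    """Decode a single RFC 2047 encoded word into a Python string."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        raise ValueError("not an encoded word: %r" % word)
    charset, encoding, text = match.groups()

    if encoding.upper() == "B":
        # B: the text is plain base64.
        raw = base64.b64decode(text)
    else:
        # Q: "_" stands for a space, "=XX" is one byte given in hex,
        # everything else is literal ASCII.
        ascii_bytes = text.replace("_", " ").encode("ascii")
        raw = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes([int(m.group(1), 16)]),
            ascii_bytes,
        )

    # Only now is the stated charset used, to turn bytes into characters.
    return raw.decode(charset)


subject = decode_encoded_word("=?Windows-1252?Q?Papier_=E0_lettre?=")
name = decode_encoded_word("=?Windows-1252?Q?Alain_LaBont=E9?=")
```

With the two headers quoted at the top of this message, subject should come
out as "Papier à lettre" and name as "Alain LaBonté".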

FYI. Aren't you glad you asked?

Anyway, some mailers give you the option to send a message in straight 8-bit, at
which point the header text is in an unknown charset, and the recipient's email
client can only guess that it's the recipient's own default charset. In straight
8-bit, there are likely no MIME headers for the body, and therefore no body
charset to guess from. Even if there were, using it would require 2 passes over
the mail message, which is either not advised by the standard, against the
spirit of the standard, or possibly specifically forbidden - I can't remember
which.

Andrea Vine, iPlanet i18n architect
A word is not a crystal, transparent and unchanging--it is the skin of a
living thought, and may vary greatly in color and content according to the
circumstances and time in which it is used. - Supreme Court Justice Holmes
