From: Jungshik Shin (firstname.lastname@example.org)
Date: Tue Dec 17 2002 - 10:50:26 EST
On Tue, 17 Dec 2002, Stephane Bortzmeyer wrote:
> On Tue, Dec 17, 2002 at 01:28:00PM +0100,
> Otto Stolz <Otto.Stolz@uni-konstanz.de> wrote
> > I have seen many messages, originally in ISO-8859-1-encoded French,
> > that got the high-bit of every accented character chopped off, thus
> > replacing "é" with "i", "î" with "n", and so forth.
When was the last time you saw this?
> Last time I saw such problems was something like ten years ago. It was
> almost never the fault of the SMTP server, but of some programs on the
> destination machine (or sometimes the faults of funny gateways like
> X400 servers, something you cannot blame on the Internet).
Although I agree that 8BITMIME is implemented and deployed
very widely these days (it's been more than two years since I received
garbled emails due to a 7bit-only path, and I receive tens of emails in
8bit encodings every day), I'm afraid your experience is unusual if the
last time you received emails with the MSB stripped off was 10 years ago.
While trying to counter the exaggeration made against the ability of
Internet email to transport UTF-8 emails, you may have gone to the other
extreme. In 1992, sendmail 4.x/5.x transported at least half of all
Internet email, and those versions were not 8bit clean. That's why RFC
1468 and RFC 1557 were written in 1993 for Japanese and Korean email
exchanges in 7bit ISO-2022-JP and ISO-2022-KR, respectively. (In the case
of ISO-2022-JP, there's another important reason: two major encodings
are used for Japanese, Shift_JIS on DOS/Windows/Mac and EUC-JP on
Unix.) As recently as 1999, I did receive MSB-stripped emails that had
not gone through a non-SMTP gateway (e.g. X400). Back then, some mail
servers still used 7bit-only sendmail 4.x or 5.x (on old SunOS 4.x,
AIX 3.x, 4.x, HP-UX 8.x, IRIX, etc. machines), old versions of PMDF (on
old VMS machines) and smail (on some Unix machines), even though
8bit-clean sendmail 8.6.x and later had been around since the mid-1990s.
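
As an illustration of the MSB stripping discussed above, here is a
minimal Python sketch. The helper name strip_msb is hypothetical, and it
only models the bit clearing itself, not the behaviour of any particular
MTA or gateway.

```python
def strip_msb(data: bytes) -> bytes:
    """Clear the high bit of every byte, as a 7bit-only hop would."""
    return bytes(b & 0x7F for b in data)


# ISO-8859-1: each accented letter is a single byte with the high bit set.
latin1 = "é î".encode("iso-8859-1")      # bytes E9 20 EE
print(strip_msb(latin1).decode("ascii"))  # prints "i n"

# UTF-8: each accented letter is two high-bit bytes, so both get mangled.
utf8 = "é".encode("utf-8")                # bytes C3 A9
print(strip_msb(utf8).decode("ascii"))    # prints "C)"
```

The first result matches the "é" to "i" and "î" to "n" damage quoted at
the top of this message. The damage is not reversible, since the cleared
bit is simply lost.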
Besides, some email servers still don't abide by the ESMTP standard
and don't include '8BITMIME' in their response to 'EHLO', even though
they support 8bit-clean transport (as you wrote).
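
For reference, checking whether a server advertises the extension
amounts to scanning its EHLO reply for the 8BITMIME keyword. The sketch
below is a rough illustration: the function name advertises_8bitmime and
the sample reply text (including the host name) are made up, and the
parsing is deliberately simple.

```python
def advertises_8bitmime(ehlo_reply: str) -> bool:
    """Return True if a raw multi-line EHLO reply lists 8BITMIME."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD [params]"; the last line
        # uses a space instead of the hyphen: "250 KEYWORD [params]".
        if not line.startswith("250"):
            continue
        words = line[4:].split()
        if words and words[0].upper() == "8BITMIME":
            return True
    return False


# Hypothetical reply from a server that does advertise the extension.
sample_reply = (
    "250-mail.example.org Hello client.example.org\r\n"
    "250-SIZE 10240000\r\n"
    "250-8BITMIME\r\n"
    "250 HELP\r\n"
)
print(advertises_8bitmime(sample_reply))
```

As noted above, a negative result only says the server did not announce
8BITMIME; it does not prove that the server mangles 8bit data. For a
live connection, Python's smtplib offers SMTP.has_extn("8bitmime") after
ehlo() for the same check.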
Nonetheless, I agree that these days most mail transport paths are 8bit
clean. Even where they are not, Base64 and QP (I don't regard them as
hacks, as you do) are well supported by most modern MUAs, so end-users
have little problem exchanging emails in UTF-8 (or in legacy 8bit
encodings). Most of them don't have to care whether 8BITMIME is used in
transit or which C-T-E is used: 8bit, QP, or Base64.
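
To show why the choice of C-T-E is invisible to the reader, here is a
small Python sketch using the standard quopri and base64 modules. The
sample text is arbitrary; the point is only that both encodings yield
pure 7bit data and decode back to the same UTF-8 bytes.

```python
import base64
import quopri

text = "déjà vu"
raw = text.encode("utf-8")         # what C-T-E "8bit" would carry as-is

qp = quopri.encodestring(raw)      # quoted-printable form
b64 = base64.b64encode(raw)        # Base64 form

# Both encoded forms contain only 7bit bytes, so a 7bit-only path
# cannot damage them by clearing the high bit.
print(all(b < 0x80 for b in qp), all(b < 0x80 for b in b64))

# Decoding either form restores the original UTF-8 bytes exactly.
print(quopri.decodestring(qp) == raw, base64.b64decode(b64) == raw)
```

QP keeps mostly-ASCII text readable in its encoded form, while Base64 is
more compact for text that is mostly non-ASCII; an MUA that supports
MIME decodes either one before display.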
> > take the pains to transform 8-bit MIME to some transfer-encoding
> > supported by the receiving server.
> Very bad idea, BTW, since it mangles the mail, which can be a problem
> with applications like cryptographic signatures. I always turn it off
> and it was never a problem. In practice (do note I refer to the real
> world), all SMTP servers accept 8-bits EVEN IF THEY DO NOT ADVERTISE
> IT PROPERLY with the 8BITMIME option.
Doing this type of C-T-E change (from 8bit to QP/Base64)
automatically at the MTA level may be a bad idea, but doing it in
MUAs should not be a problem (that's what end-users choose). With most
modern MUAs supporting the MIME standard very well (notable exceptions
being Eudora and some popular web mail services), the 8bit-cleanness
of the transport path doesn't matter much for UTF-8 email exchange,
as I wrote above.
IMHO, the biggest obstacle to email exchange in UTF-8 is not
7bit-only SMTP but the fact that people don't feel a strong need to
switch, because they think legacy encodings work just fine for them.
(Not many people need to exchange emails in languages other than their
native ones, let alone multilingual emails that cross the boundary of
legacy encodings.) Another obstacle is that popular web mail services
don't support UTF-8 well, incorrectly assuming that there's 'the'
invariant mapping between languages and MIME charsets/encodings (e.g.
for French, use ISO-8859-15/1 or Windows-1252; for Japanese,
ISO-2022-JP). Therefore, even though major MUAs have no problem with
UTF-8 emails, some people are reluctant to send all their outgoing
emails in UTF-8 for fear that their correspondents with web mail
accounts won't be able to read them without some 'user intervention'.