Re: international characters in email subject line

From: Jungshik Shin (jshin@mailaps.org)
Date: Tue Feb 13 2001 - 01:08:04 EST


On Mon, 12 Feb 2001, Alain LaBonté  wrote:

> À 19:53 01-02-12 -0800, Sean O Seaghdha a écrit:
> >Ar 12 Feb 2001, ag 15:06 scríobh Michael (michka) Kaplan
> >fán ábhar "Re: international characters in ema":
> >
> >Of course, since my preferred mail program is Pegasus Mail, which can only be
> >configured for one character set, I can't usually read such headers anyway.
>
> [Alain] Some years ago, I was also using Pegasus mail and I was not
> satisfied with this. I then communicated with the author directly (he lives
> in Southern New Zealand); we engaged in a series of exchanges and I made
> him accept to carry on the character set in use without conversion [in my
> case the Windows character set]... You have to use a parameter for this,
> this is the compromise he made me accept because he was really impressed by
> the SMTP 7-bit-only-headers dogma -- which does not impress me since it
> works any way with 8-bit-clean systems [predominant nowadays in the world
> since a serious security breach, I was told, was corrected with an
> 8-bit-clean-enabling SMTP patch].

Do you mean sendmail 5.x vs. sendmail 8.x (I'm not saying sendmail is the
only MTA on earth)? Hmm, that is ancient history by now. Even after
sendmail 8.x came out, there were numerous security problems (most of
which have been fixed, and sendmail has become a lot more secure than
before). Anyway, security has little to do with this.

> I'm using Win Eudora Pro -- which uses the system c.s. without any question
> (under MacEudora it does conversions though, and illogically -- well, I
> said nothing, I know some will want to kill me because they like it like
> this -- except that it does not communicate well with the rest of the
> world;

I'm not going to kill you, but I think MacEudora is a step ahead of Eudora
for MS-Windows in that it allows users to choose which MIME charset to use
for outgoing messages (thus, non-Western Europeans can label their
outgoing messages with the correct MIME charset) and offers users a way
to install plug-ins for MIME charset/encoding conversion. (BTW, what do
you mean by 'illogical' conversion and by its not communicating with the
rest of the world? The plug-in mechanism for charset/encoding conversion
is there exactly because encodings like MacLatin are not compatible with
more widely used ones like ISO-8859-1. If you find that the
MacLatin<->ISO-8859-1 converter that comes with Eudora is doing something
wrong, I guess you can write your own and use it instead.) I don't know
why on earth Qualcomm doesn't offer the same mechanism in the Windows
version.
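
To make the incompatibility concrete, here is a minimal sketch of what such a converter has to do. It assumes that Python's "mac_roman" codec corresponds to what is called MacLatin above, and the function name mac_to_latin1 is made up for illustration; it is not Eudora's plug-in interface.

```python
def mac_to_latin1(data):
    """Re-encode MacRoman bytes as ISO-8859-1.

    Assumption: Python's "mac_roman" codec stands in for MacLatin.
    Characters with no ISO-8859-1 equivalent are replaced by '?'.
    """
    return data.decode("mac_roman").encode("iso-8859-1", errors="replace")


if __name__ == "__main__":
    text = "café"
    mac = text.encode("mac_roman")
    latin1 = text.encode("iso-8859-1")
    # The same character 'é' is a different byte in each encoding.
    print(mac, latin1)
    print(mac_to_latin1(mac) == latin1)
    # A character that exists only on the Mac side cannot be carried over.
    print(mac_to_latin1("™".encode("mac_roman")))
```

The last line shows why a converter can only be lossless for the characters the two sets share.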

> Win Eudora Pro is however not a model more than Mac Eudora since it
> can't adapt to MIME-tagged alien c.s.).

Right. That's why I regard it as the worst MUA in terms of I18N among
the popular MUAs. MS Outlook Express and Netscape 6.x/Mozilla are a
whole lot better than Eudora in terms of I18N. And, IMHO, plain old
(but modernized to support MIME and IMAP4 well) text-based Unix MUAs
such as mutt and pine are much better than Eudora. BTW, all of these
can work with UTF-8. Can Eudora?

> If I were the issuer of MIME, I would invent a new header that would say
> what is the implicit character set used for the headers. I hate the
> MIME-tagging (which is btw limited to 7-bit QP) inside headers themselves --

When used to encode 8-bit characters in the message header, it's called
Q-encoding (NOT QP, which is for encoding the message body). Moreover,
B-encoding (similar to Base64) is more appropriate when 8-bit characters
outnumber 7-bit ones (e.g., in CJK text).
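
A rough sketch of the two header encodings, to show why one suits mostly-ASCII text and the other suits CJK. The helpers q_encode, b_encode and decode_word are written here for illustration only (the Q-encoder is deliberately conservative and escapes everything except ASCII letters and digits); only decode_header comes from Python's standard library.

```python
import base64
import string
from email.header import decode_header

SAFE = string.ascii_letters + string.digits  # left untouched by this Q-encoder


def q_encode(text, charset):
    """Q-encoding: keep ASCII letters/digits, '_' for space, =XX otherwise."""
    out = []
    for b in text.encode(charset):
        ch = chr(b)
        if ch in SAFE:
            out.append(ch)
        elif ch == " ":
            out.append("_")
        else:
            out.append("=%02X" % b)
    return "=?%s?Q?%s?=" % (charset, "".join(out))


def b_encode(text, charset):
    """B-encoding: Base64 over the charset-encoded bytes."""
    raw = text.encode(charset)
    return "=?%s?B?%s?=" % (charset, base64.b64encode(raw).decode("ascii"))


def decode_word(word):
    """Decode a single encoded-word back to text."""
    data, charset = decode_header(word)[0]
    return data.decode(charset)


if __name__ == "__main__":
    samples = [("Re: caractères internationaux", "iso-8859-1"),
               ("한글 제목", "euc-kr")]
    for text, charset in samples:
        q, b = q_encode(text, charset), b_encode(text, charset)
        print(len(q), q)
        print(len(b), b)
        # Both forms decode back to the original text.
        assert decode_word(q) == decode_word(b) == text
```

For the French subject the Q form is shorter and still mostly legible to the eye; for the Korean one every byte has to be escaped, so the B form is considerably shorter.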

> half the time they are undecoded by many email programs, bridges of all
> kinds (one daily nightmare for me is communicating with Lotus Notes, which
> my employer chose for our intranet) and mail reflectors and it makes titles
> very hard to read.

I know that whatever I say will not convert you on this issue. However,
I think the fault lies not with MIME (RFC 2045-2049 and updates) but with
those who have left their programs unfixed for so long. (Lotus Notes still
doesn't support MIME. Hmm, I have no words. How come Lotus does that?)
Moreover, with B/Q-encoding of the message header, no information is
irreversibly lost, so you can always restore it manually. On the
other hand, however rare it may be, if 8-bit characters go through a 7-bit
gateway and their MSBs are stripped off, the information is gone forever.
Some heuristics may help you (especially for languages like French written
in ISO-8859-1, where only a small subset of characters are 8-bit), but the
restoration is never going to be perfect (especially for cases like Korean
in EUC-KR or Simplified Chinese in EUC-CN/GB-2312-80).
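
A small simulation of that difference, as a sketch only: strip_msb is a made-up stand-in for a 7-bit gateway that clears the top bit of every byte, and b_word and decode_word are illustrative helpers, not part of any MUA.

```python
import base64
from email.header import decode_header


def strip_msb(data):
    """Simulate a 7-bit gateway that clears the top bit of every byte."""
    return bytes(b & 0x7F for b in data)


def b_word(text, charset):
    """Build a B-encoded word (pure ASCII) for the given text."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


def decode_word(word):
    """Decode a single encoded-word back to text."""
    data, charset = decode_header(word)[0]
    return data.decode(charset)


if __name__ == "__main__":
    # Raw 8-bit French: 'é' (0xE9) collapses to 'i' (0x69).
    print(strip_msb("Québec".encode("iso-8859-1")))  # b'Quibec'
    # Raw 8-bit Korean: every byte is damaged, leaving ASCII junk.
    korean = "한글"
    print(strip_msb(korean.encode("euc-kr")))
    # The B-encoded form is already 7-bit, so the gateway changes nothing.
    word = b_word(korean, "euc-kr").encode("ascii")
    survived = strip_msb(word)
    print(decode_word(survived.decode("ascii")))
```

After stripping, "Québec" and a genuine "Quibec" are the same bytes, which is why restoration has to guess; in French a dictionary can often guess right, but in EUC-KR every character is affected.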

> As I usually work in French, I have intensive experience with this.

Well, I'm afraid that working in French is not so good a qualification
;-) for claiming extensive experience with this issue, namely, I18N
of MUAs (Mail User Agents), because most MUAs support at least ISO-8859-1
quite well unless they're really brain-dead and support only US-ASCII.

Regards,

Jungshik Shin


