Re: Communicator Unicode

From: Martin J. Dürst
Date: Sun Oct 12 1997 - 10:32:15 EDT

On Wed, 1 Oct 1997, Alain LaBonté SCT wrote:

> >> Declaring that all of a sudden current 8-bit coding, untagged, is UTF-8
> >> (for which support I am all in favour, of course, if it is tagged), would
> >> disrupt current practice that works well and could easily work better when
> >> different encodings are used between the sender and the recipient. Again,
> >> the only coding unaffected by assuming that 8-bit data is UTF-8 would be
> >> 7-bit ASCII. To this I am opposed.

> [Alain] :
> I am of course very sensitive to your argument saying that this would be
> punitive to those who respected the standard, if it is really the case. But
> is it? Right now recipients who follow the standard (or users who are
> forced to respect a standard because they're stuck with a poor mailer) are
> already punished anyway because they deprive themselves of an accurate
> interpretation that would be easy to guess, as you yourself say that you
> would do if you had the opportunity. So it is more than a standard, it is a
> dogma that is really annoying everybody. Such standards should be
> corrected, or at least guidelines be given to adapt them softly, in
> particular because the practice to use 8-bit characters in headers is
> spread worldwide already, just because it makes sense (but it is also done
> by end-users who have no idea that what they are doing is a sin against
> "good engineering design"!)

Alain - Your "spread worldwide" is somewhat biased. I have never seen
it in Japan, for example. And it's not the user's problem; good software
handles that all by itself.
There is an old Internet principle: Be liberal in what you accept,
be conservative in what you send. What I was proposing was to
accept raw 8-bit headers of the sort you are describing. What
you are describing is to make this practice a recommendation,
so that mailer implementors start to think it is okay to send
such stuff. There is a big difference between these two things.
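
To illustrate the "accept" half of that principle, here is a minimal
sketch in Python of how a receiving mailer might interpret an untagged
8-bit header. The function name decode_raw_header and the choice of
ISO-8859-1 as the fallback are assumptions made for illustration; neither
comes from this thread or from any standard.

```python
def decode_raw_header(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Guess the text of an untagged 8-bit header value (hypothetical helper)."""
    try:
        # Pure 7-bit ASCII passes through unchanged, and well-formed
        # UTF-8 is accepted as UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Legacy 8-bit text is rarely valid UTF-8, so fall back to the
        # assumed legacy charset instead of rejecting the header.
        return raw.decode(fallback)


# The same name as raw bytes in the two encodings discussed above.
utf8_bytes = "Alain LaBonté".encode("utf-8")
latin1_bytes = "Alain LaBonté".encode("iso-8859-1")

print(decode_raw_header(utf8_bytes))
print(decode_raw_header(latin1_bytes))
```

This only covers the receiving side. A sender that stays conservative
would still emit encoded-words.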

> Another approach would also be to allow tagging standard 8-bit character
> sets totally in front of a full string, which is apparently not the case
> today (those who use more efficient 8-bit coding are punished everyday,
> even if they also use old ISO standards [ISO/IEC 8859 series began to be
> adopted in 1987, at a time when even the "/IEC" was not part of the names of
> the IT standards]!)
> Even if today I wanted to do (I invent syntax slightly here, but it is just
> an illustration):
> (=?iso-8859-1?Alain LaBonté=)
> or even:
> (=?UTF-8?Alain LaBonté=)
> ...I would not even be allowed to do it! That's a sin too!

Currently, we have Q for Quoted Printable, and B for Base64.
I can quite imagine that at some point we will get an additional
R for raw. With that, your examples, in full syntax, would look like:

                       =?iso-8859-1?R?Alain LaBonté?=
                       =?UTF-8?R?Alain LaBonté?=

Maybe I'll write an Internet-Draft about this when I find time :-).
My proposal, as already stated, would be to allow the latter one
directly, as:
                        Alain LaBonté
to get rid of the encoded-words in the long term. Such things
will probably get introduced somewhere sooner or later, but
my guess is that maybe News or HTTP will be sooner, and Mail
will be later.
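
For comparison, the following Python sketch builds the two existing
encoded-word forms (Q and B) by hand, next to the "R" form suggested
above. The helper names are hypothetical. The Q encoder is deliberately
simplified: it escapes every byte that is not an ASCII letter or digit.
The "R" encoding is only the idea floated in this message and is not
part of any standard, so the standard library decoder is applied only
to the Q and B words.

```python
import base64
import string
from email.header import decode_header, make_header

# Bytes emitted literally by the simplified Q encoder below.
SAFE = set((string.ascii_letters + string.digits).encode("ascii"))


def encoded_word_q(text: str, charset: str) -> str:
    """Q form: '_' for space, '=XX' for any other byte outside SAFE."""
    out = []
    for b in text.encode(charset):
        if b in SAFE:
            out.append(chr(b))
        elif b == 0x20:
            out.append("_")
        else:
            out.append("=%02X" % b)
    return "=?%s?Q?%s?=" % (charset, "".join(out))


def encoded_word_b(text: str, charset: str) -> str:
    """B form: Base64 of the charset-encoded bytes."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


def encoded_word_raw(text: str, charset: str) -> bytes:
    """Hypothetical 'R' form: the raw 8-bit bytes, only tagged (not a standard)."""
    return b"=?" + charset.encode("ascii") + b"?R?" + text.encode(charset) + b"?="


def roundtrip(word: str) -> str:
    """Decode a Q or B encoded-word with the standard library."""
    return str(make_header(decode_header(word)))


name = "Alain LaBonté"
print(encoded_word_q(name, "iso-8859-1"))
print(encoded_word_b(name, "UTF-8"))
print(encoded_word_raw(name, "UTF-8"))
```

Q and B both produce pure 7-bit output, whereas the "R" form puts the
8-bit bytes on the wire and keeps only the charset tag.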

Regards, Martin.
