Re: (off topic) More bad conversions, was: Designing a multilingual web site

From: addison@inter-locale.com
Date: Wed Jul 19 2000 - 14:43:47 EDT


Outlook is a pretty typical Windows application and a pretty typical Mail
client. It is usually pretty good at reading the MIME tags on a message
and assigning the appropriate character set to the message.

Since the message you received apparently had no such tag, it was treated
as ASCII.

This is the problem for Notepad. Your "ASCII" is treated, I believe, like
the "DOS" code page in Win2K (aka the OEM code page). This results in a
character set conversion in Notepad from code page 437 (presuming that
you're running US Windows) or code page 850 (if you're running a
different Western European Windows) to CP1252, scrambling the Arabic
characters pretty thoroughly. Nothing you do in Notepad after that will
work.
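
A rough sketch of why the two attempts differ, in Python. It is only an
illustration: it assumes the original bytes were Windows-1256 ("Arabic
(Windows)"), the sample word and the helper name are made up, and Python's
codecs stand in for whatever conversion Windows actually performs (which
may substitute characters differently).

```python
# Illustration only: assumes the message bytes were Windows-1256.
# The sample word ("marhaba") and the helper below are hypothetical.
ORIGINAL = "\u0645\u0631\u062d\u0628\u0627"
raw = ORIGINAL.encode("cp1256")


def misread_then_save_as_ansi(data: bytes, assumed: str) -> bytes:
    """Decode bytes with the wrong code page, then save as CP1252 ("ANSI")."""
    text = data.decode(assumed)
    # Characters with no CP1252 equivalent are replaced by "?" here.
    return text.encode("cp1252", errors="replace")


# First attempt: bytes read as the OEM code page (437), then written as 1252.
via_oem = misread_then_save_as_ansi(raw, "cp437")
# Second attempt: bytes read as 1252 and written as 1252, so nothing changes.
via_ansi = misread_then_save_as_ansi(raw, "cp1252")

print(via_oem.decode("cp1256") == ORIGINAL)   # False: bytes were destroyed
print(via_ansi.decode("cp1256") == ORIGINAL)  # True: bytes survived intact
```

In the first case the bytes map to Greek and box-drawing characters that
CP1252 cannot represent, so the save step is lossy. In the second case the
wrong interpretation happens to round-trip byte for byte, which is why
switching the encoding to Arabic (Windows) afterwards works.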

Note that Outlook should be able to display your message itself without
resorting to these gymnastics (although I can't verify it: I don't use
Outlook any more either).

Regards,

Addison

=======================================================
Addison P. Phillips Principal Consultant
Inter-Locale LLC http://www.inter-locale.com
Globalization Engineering & Consulting Services

+1 408.210.3569 (mobile) +1 408.904.4762 (fax)
=======================================================

On Tue, 18 Jul 2000, Michael (michka) Kaplan wrote:

> From: "Munzir Taha" <munzir_taha@yahoo.com>
> >From: "Michael (michka) Kaplan" <michka@trigeminal.com>
> > >You should explicitly set the encoding in the header of your page, and not
> > >leave it for the browser to guess. The following should go all in one line
> > >at the very top of the header:
> > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
>
> > Yes, I understand the point of putting the header in each page explicitly.
> > But my question is: how did the browser guess it?
>
> Browsers sometimes guess, especially IE... you asked it to AutoDetect....
> with a feature that has a name like THAT it has to guess right sometimes,
> right? :-)
>
> > Another question: I received a message in Arabic through Outlook 2000. It
> > didn't appear right until I changed the encoding to Arabic (Windows). I
> > copied the (garbage) text (which is encoded as US-ASCII) and pasted it into
> > Notepad in Win2K. I saved the file in all available formats and renamed
> > each file to .htm. I then tried the different encodings, but to no avail:
> > the garbage text didn't change at all.
> >
> > I then went back to Outlook, changed the encoding of the message to
> > Western European (Windows), saved it as ANSI text, renamed it to .htm,
> > changed the encoding to Arabic (Windows), and it was OK. Can you please
> > explain to me why the first failed whereas the second succeeded?
>
> I think we are really getting way too far off the topic of THIS group here,
> and this thread is wandering a little out of control. To your question above,
> I can honestly say I have no earthly clue, as I do not use Outlook at all.
>
> michka
>
>


