Re: 8859-1, 8859-15, 1252 and Euro (fwd)

From: Jungshik Shin
Date: Thu Feb 10 2000 - 16:29:39 EST

On Thu, 10 Feb 2000, Robert A. Rosenberg wrote:

> At 10:11 AM 02/10/2000 -0800, Rick McGowan wrote:
> > > Does the rest of the industry ALSO use the appearance of characters
> > > in the x81-x9F codepoint range as an indication to tag as windows-1252?
> >
> >Around here we have a mail client that looks at MIME tags and recognizes a
> >lot of codesets. There's an awful lot of mail out there in the world that
> >is tagged 8859-1, but actually contains Windows CP1252. I think our
> >heuristic is that if it *says* 8859-1 but contains C1 entities (x81-9f)
> >then it must be CP1252... Yuck. Mailers should tag it like it is.
>
> I fully agree. If I remember correctly, OE will use CP-1252 as its Charset
> in this case. Lack of codes in the "extra" range get marked as ISO-8859-1
> (I think). If it is smart enough to mark as US-ASCII (in the absence of
> High ASCII), it should be able to monitor for x80-x9F too.

    Is it? I always get emails in PURE US-ASCII but marked as
ISO-8859-1 (or whatever superset of US-ASCII: ISO-8859-x, ISO-2022-JP,
EUC-KR, etc.) from MS OE 4.x/5.x, Netscape 4.x, and Eudora users. (It's not
a violation of any standard, but I can't help feeling that labelling messages
of pure US-ASCII as any superset is not desirable. Look at what Pine
and other decent terminal-based Unix mail clients do in that case.)
WinEudora and Netscape 4.x are worse than that in that they MISlabel
Windows-(12xx|949) as ISO-8859-x|EUC-KR. (Windows-949 is MS's proprietary
extension of EUC-KR for Korean Windows.) BTW, MS OE 4.x/5.x is NOT
free from the sin of MISlabelling Windows-949 as EUC-KR (or, even worse,
as the NON-standard ks_c_5601-1987 of their own invention).
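
    For what it's worth, the labelling heuristic discussed in this thread
can be sketched roughly as below. This is only an illustration, not what any
of the mailers mentioned actually does: the function name guess_charset and
its parameters are hypothetical, and it assumes the raw body bytes and the
declared MIME charset are already in hand. Pure 7-bit bodies get labelled
US-ASCII; bodies declared as ISO-8859-1 that contain bytes in x80-x9F are
assumed to be really Windows-1252.

```python
def guess_charset(body: bytes, declared: str = "iso-8859-1") -> str:
    """Return a charset label for `body` (hypothetical helper, a sketch only)."""
    declared = declared.lower()

    # No byte has the high bit set: plain US-ASCII is the tightest label.
    if all(b < 0x80 for b in body):
        return "us-ascii"

    # ISO-8859-1 reserves 0x80-0x9F for C1 control codes, which almost
    # never occur in mail text; Windows-1252 puts printable characters
    # (Euro sign, curly quotes, ...) there instead.
    if declared == "iso-8859-1" and any(0x80 <= b <= 0x9F for b in body):
        return "windows-1252"

    # Otherwise trust the declared label.
    return declared
```

A receiving client could call guess_charset(raw_body, declared_charset) before
decoding; a sending client could use the same test to pick the label in the
first place. It does nothing about the Windows-949 vs. EUC-KR case, which
would need a check of the double-byte lead/trail ranges.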

    Jungshik Shin

