Re: 8859-1, 8859-15, 1252 and Euro

From: Robert A. Rosenberg
Date: Thu Feb 10 2000 - 11:44:52 EST

At 11:00 AM 02/10/2000 -0500, Frank da Cruz wrote:
> > At 02:56 PM 02/07/2000 -0800, A. Vine wrote:
> > >Tim Greenwood wrote:
> > > >
> > > > Pretty much all of the pages on the web, and the browsers, ignore the
> > > > differences between ISO-8859-1 and Windows code page 1252.
> > >
> > >I wish they would! I'm pretty sick of seeing question marks where there
> > >should be quotes, apostrophes, bullets, em-dashes, etc.
> >
> > The real [short term] solution is to have a preference switch that says
> > "Treat ISO-8859-1 as Windows-1252" so that the "undefined" (x80-x9F) range
> > maps to the Windows-1252 characters. Also users should send some
> > "clue-by-four" error messages to web sites that do not show the character
> > set as windows-1252 instead of ISO-8859-1 when using this character range
> > (ie: Show the CORRECT Character Set). IMO - A BUG REPORT to ADOBE and MS
> > for their Web Design products to say that use of these characters should
> > FORCE windows-1252 into the HTML is not out of line.
> >
>The use of Windows code pages in data communications protocols is bad, bad,
>bad. Here we are saying "well, CP-1252 is just like Latin-1 except it
>includes the extra characters we need for our documents so we'll use it
>instead of Latin-1 but call it Latin-1

Please REREAD my comment. I said to STATE that the charset is
CP-1252/Windows-1252 if ANY x80-x9F codepoint is used (not pretend that it
is still pure ISO-8859-1).
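
A minimal sketch of that rule in Python, for illustration only: label the
document windows-1252 as soon as any x80-x9F byte appears, otherwise keep
ISO-8859-1. The function name pick_charset_label is made up, and the sketch
assumes the bytes really were produced in one of these two charsets.

```python
def pick_charset_label(data: bytes) -> str:
    """Return the charset name to declare for a single-byte Western document.

    Hypothetical helper: assumes the input is either pure ISO-8859-1 or
    Windows-1252, and only decides which of the two labels is honest.
    """
    # Any byte in 0x80-0x9F means the Windows-1252 additions are in use.
    if any(0x80 <= b <= 0x9F for b in data):
        return "windows-1252"
    return "ISO-8859-1"


# 0xE9 (e-acute) is shared by both sets, so ISO-8859-1 is still accurate.
print(pick_charset_label(b"caf\xe9"))
# 0x93 / 0x94 are the curly double quotes that exist only in Windows-1252.
print(pick_charset_label(b"\x93quoted\x94"))
```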

>because really it's a fixed version
>of Latin-1". But then we start thinking we can do the same thing with (for
>example) Latin-2 and CP1250. Which would be a bad mistake, because the two
>are not alike at all. Ditto for Latin/Cyrillic and CP1251. In general,
>Windows code pages are NOT just "extended" ISO 8859-x's.

The same applies here. Use the Windows-125x charset designation if that is
what is actually being used (i.e., there are x80-x9F characters). Unless you
can show a mismatch in the x00-x7F and/or xA0-xFF glyphs/characters between
ISO-8859-1 and Windows-1252 (or between one of the other 125x sets and the
corresponding 8859 set), I fail to see why "Windows code pages are NOT just
'extended' ISO 8859-x's" (just as ISO-8859-1 is an extended version of
US-ASCII). How is a file created on a Windows machine (in CP1252) not a valid
Latin-1 file so long as it does not contain any of the characters/glyphs that
MS added in the 32-codepoint x80-x9F range?
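
The comparison asked for here can be checked mechanically. The following is
a sketch in Python that relies on the codec tables shipped with the standard
library (cp1252, latin_1, cp1250, iso8859_2), which are assumed to match the
published mappings. The helper name mismatches and the constant SHARED are
hypothetical.

```python
def mismatches(codec_a: str, codec_b: str, ranges) -> list:
    """List the byte values in `ranges` that decode differently in two codecs."""
    diffs = []
    for lo, hi in ranges:
        for b in range(lo, hi + 1):
            raw = bytes([b])
            # errors="replace" turns an unassigned byte into U+FFFD instead
            # of raising, so unassigned positions can still be compared.
            if raw.decode(codec_a, errors="replace") != raw.decode(
                codec_b, errors="replace"
            ):
                diffs.append(b)
    return diffs


# The two ranges named in the question: x00-x7F and xA0-xFF.
SHARED = [(0x00, 0x7F), (0xA0, 0xFF)]

print(mismatches("cp1252", "latin_1", SHARED))
print([hex(b) for b in mismatches("cp1250", "iso8859_2", SHARED)])
```

With the Python tables, the first call reports no differences for CP1252
versus Latin-1. The CP1250 versus Latin-2 list is not empty: xA1, for
instance, is a caron in CP1250 and A-ogonek in Latin-2.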

>- Frank
