RE: DEC multilingual code page, ISO 8859-1, etc.

From: Chris Pratley (chrispr@MICROSOFT.com)
Date: Fri Mar 24 2000 - 06:20:48 EST


Doug, although I was nowhere near Microsoft when the decision to define
Windows-1252 was taken, I think your paragraph is good:

> No doubt MS added the extra graphics characters to the 0x80-0x9F range
> without any worries about conflicting with the C1 range, since few
> people dreamed in the mid-1980s that one day PCs would all be connected
> to the Internet and would be exchanging data with ISO 2022-compliant
> systems.

When I played around on Atari 400/800 and Apple II computers in the late 70s
and early 80s, those all had their own character sets just like DOS did. It
doesn't seem to bother anyone now that those vendors made up their own sets
of characters. I remember at that time no one could share files because you
couldn't even read each other's disks, let alone character encodings, and
modems were 300 baud and only used by the very few. Sharing data across
computer systems was just not part of the design spec. And if we had the
benefit of hindsight, there are a bunch more things I would fix before the
conflict of windows codepages with iso-2022-jp in plain-text data transfer.

BTW, with IE and Office, we try to support the text and HTML encodings used
on Windows, DOS, Mac, Unix, and even EBCDIC (no, we don't handle Atari or
TRS-80 yet nor will we) but I see a lot of complaints from Unix users on
this list about not being able to read HTML pages encoded in windows-1252
when the characters in the 80-9F range are used. I'm curious why the makers
of whatever browsers these are don't simply add support for non-ISO
encodings like windows-1252 and be done with it (whether windows-1252 is
registered at the glacial IANA or not shouldn't matter - we tried to
register windows-1252 there for years with no response, yet the missing
registration is claimed to be the fault of Microsoft. Bizarre). Isn't it
fairly trivial and also worthwhile to support this encoding? I'm genuinely
curious, so no flames please.
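
To make the difference concrete, here is a minimal sketch in Python (not
anything taken from IE or Office). It assumes Python's built-in codec names
"cp1252" and "latin-1" for windows-1252 and ISO 8859-1, and the sample bytes
are made up for illustration. The same bytes decode to printable punctuation
under windows-1252 but to C1 control codes under ISO 8859-1:

```python
# Sample bytes (hypothetical): 0x93, 0x94 and 0x96 fall in the 0x80-0x9F range;
# 0xE9 is in the range where both encodings agree.
data = b"\x93quoted\x94 \x96 caf\xe9"

# windows-1252 assigns graphic characters to 0x93, 0x94 and 0x96.
as_cp1252 = data.decode("cp1252")

# ISO 8859-1 maps every byte to the code point of the same value,
# so 0x93, 0x94 and 0x96 become C1 control characters.
as_latin1 = data.decode("latin-1")


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


print(code_points(as_cp1252))
print(code_points(as_latin1))
```

Outside 0x80-0x9F the two decodings agree, so a reader that accepts
windows-1252 loses nothing on data that really is ISO 8859-1 unless that data
relies on C1 controls.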

Chris Pratley
Group Program Manager
Microsoft Word

-----Original Message-----
From: Doug Ewell [mailto:dewell@compuserve.com]
Sent: Thursday, March 23, 2000 9:54 PM
To: Unicode List
Subject: Re: DEC multilingual code page, ISO 8859-1, etc.

Edwin F. Hart <edwin.hart@jhuapl.edu> wrote:

> Actually, there were 3 very similar proposals in the draft stage and
> the 3 differed in only a few character assignments:
>
> ECMA 94, ISO 8859-1, and ANSI dp 131.2 (or 132.2)
>
> I heard that representatives from the 3 organizations developed a
> compromise proposal that became ECMA 94 and ISO 8859-1. The ANSI
> proposal was never progressed beyond the draft stage and in the early
> 1990s, ANSI/ISO 8859-1 was adopted instead as the "8-bit ASCII"
> standard.
>
> The DEC multilingual page is likely based on one of the 3 drafts.
> However, DEC had to commit the design before the compromise was
> reached and final standards were approved.

That sounds very much like the story I read concerning Microsoft. MS
adopted the ANSI standard (possibly only a draft standard at the time)
for Windows before it was approved as an ISO standard. I have no
trouble believing there were changes or a compromise; that does not
change what MS did.

No doubt MS added the extra graphics characters to the 0x80-0x9F range
without any worries about conflicting with the C1 range, since few
people dreamed in the mid-1980s that one day PCs would all be connected
to the Internet and would be exchanging data with ISO 2022-compliant
systems.

I would be interested to hear what Chris, Murray, or any other Microsoft
people can add to this.

-Doug Ewell
 Fullerton, California


