RE: DEC multilingual code page, ISO 8859-1, etc.

From: Chris Pratley (chrispr@MICROSOFT.com)
Date: Tue Mar 28 2000 - 08:40:18 EST


It's an interesting conundrum. Do we support Unicode formats but still
produce non-Unicode by default, or the reverse? Some people on this list
might perceive the latter as the bitter pill that the web needs to swallow
(i.e. we start producing UTF-8 data by default). Unfortunately millions of
users would perceive this as a "bug", and a huge one at that. Some might
even imagine it to be a conspiracy to drive everyone to upgrade to the
latest Microsoft software. Imagine that, Microsoft being accused of a
conspiracy... :)

However, I know there will be a point when enough users will have a system
that handles UTF-8 by default that we could make this switch without
upsetting too many people. The question is when. What % of users are we
willing to cut off? (In some countries that do not use ASCII characters, it
would be a complete cut-off: the entire mail would be unreadable, not just
the punctuation.)

As a counter-example to my own argument, with IE5 and Office2000 we did
decide to take a leap and force Unicode on users. Not many people know that
IE5 and Office2000 send URLs in UTF-8 by default. The server is expected to
assume UTF-8 if it could be UTF-8, otherwise try to use its local encoding
(IIS4 and 5 do this). We got significant complaints in Korea and Taiwan
where there were apparently a significant number of ISPs supporting local
characters in URLs by assuming the local encoding (KSC-5601 or Big5) so we
had to turn UTF-8 off by default there, but in most other areas it went over
alright.
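
A minimal sketch of the server-side rule described above (assume UTF-8 if the
bytes could be UTF-8, otherwise try the local encoding). The function name
decode_url_path and the default of Big5 are assumptions for illustration; this
is not the actual IIS logic.

```python
from urllib.parse import unquote_to_bytes


def decode_url_path(raw_path: str, local_encoding: str = "big5") -> str:
    """Decode a percent-encoded URL path, preferring UTF-8.

    Hypothetical helper: local_encoding stands in for whatever
    encoding the server's locale would otherwise assume.
    """
    # Turn %XX escapes back into the raw bytes the client sent.
    raw = unquote_to_bytes(raw_path)
    try:
        # If the bytes form valid UTF-8, treat them as UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Otherwise fall back to the local (legacy) encoding.
        return raw.decode(local_encoding, errors="replace")
```

The weakness of this rule is the one the Korean and Taiwanese complaints point
at: it only helps when the server applies it, and a server that assumes the
local encoding unconditionally will misread UTF-8 URLs.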

Does anyone else find it bizarre that we engage in so much internecine
warfare on this list, when the whole purpose we are on it is to further the
cause of Unicode? So much finger pointing, when the real problems are just
that things are not being implemented fast enough, and the people who get in
the way are not on the list anyway.

BTW: there is a known bug in some email programs (some Microsoft, some not)
that label win-1252 mail as iso-8859-1 by mistake. This is definitely
unacceptable and is being eradicated in new releases of Microsoft mail
clients, but there is little we can do about older releases (except *force*
them to upgrade! Ba ha ha! (evil Unicode-fanatic laugh)). For my own
protection, I will at least say that in my past mail I was referring
specifically to web pages being correctly labelled as win-1252, not mail.
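
For readers on the receiving end of such mail, a rough sketch of how a client
might cope with the mislabelling: bytes 0x80 through 0x9F are C1 control codes
in iso-8859-1 but mostly printable punctuation in win-1252, so their presence
is a strong hint that the label is wrong. The function name
decode_labelled_latin1 is hypothetical, and this is a heuristic, not a
description of any shipping mail client.

```python
def decode_labelled_latin1(data: bytes) -> str:
    """Decode bytes that were labelled iso-8859-1 (hypothetical helper).

    Heuristic: real iso-8859-1 text almost never contains C1 control
    codes, so any byte in 0x80-0x9F is taken to mean win-1252.
    """
    if any(0x80 <= b <= 0x9F for b in data):
        # A few bytes in this range are unassigned in win-1252;
        # errors="replace" turns those into U+FFFD instead of failing.
        return data.decode("cp1252", errors="replace")
    return data.decode("iso-8859-1")
```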

Chris Pratley
Group Program Manager
Microsoft Word

-----Original Message-----
From: A. Vine [mailto:avine@eng.sun.com]
Sent: March 27, 2000 12:22 PM
To: Unicode List
Subject: Re: DEC multilingual code page, ISO 8859-1, etc.

Kevin Bracey wrote:
>
> In message <200003250043.QAA27919@unicode.org>
> Chris Pratley <chrispr@microsoft.com> wrote:
>
> > And I don?t think getting
> ^ great demonstration of your software, btw :)
> > hardcore and disabling the current browser workaround of treating &#128;
> > through &#159; as windows-1252 is the right way either - it is just
> > frustrating and leads to a "buggy" experience for the end-user.
>
> But Microsoft could do it. Perhaps you could use your near-monopoly to push
> these sorts of things through? As long as Internet Explorer allows people
> to tag CP1252 as ISO-8859-1 all other browser authors will have to, or
> customers whinge that "it works on IE".
>
> Microsoft has the market position to enforce standards on the Web - people
> write their pages FOR Internet Explorer without realising that hundreds of
> other browsers exist. If IE wasn't so forgiving, the Web would be a lot
> cleaner.
>
> <cynicism> Of course, conforming to standards would just make it easier for
> other people to write browsers. </cynicism>
>

I heartily concur. Our mail clients are putting out the charsets specified in
RFCs, correctly labeled. Where there are int'l standards for the charset, we
are using them. There is no reason for MS tools to generate proprietary MS
chars/positions for emails and Web pages, or at the very least it would be nice
if the users were informed that they were using proprietary characters that
others may not be able to see, or worse, may cause severe problems for some
users.

Microsoft could use its power to enforce int'l standards. Yes, you are doing it
with Unicode, but in the meantime, it would be nice to repair the 1252
situation, instead of encouraging it.

Andrea

--
Andrea Vine, avine@eng.sun.com, iPlanet i18n architect
...even if it requires not really a dance with the Devil, but
call it a brief shimmy with his accountant's daughter.
-- Sean Burke http://www.netadventure.net/~sburke/



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:00 EDT