RE: Usage of CP1252 characters on www.msnbc.com

From: Chris Pratley (chrispr@microsoft.com)
Date: Mon Jul 07 1997 - 22:33:26 EDT


Thanks for the response, Markus.

Although a configurable option is a possible solution, we know that the
typical user (representing around 95-98% of users) never changes
defaults in a program, especially something as obscure as encoding
options. As you may know, it is very popular to attack Microsoft for "UI
bloat", and IMHO this would no doubt add to that. But assuming we have
options, "which one do you default to?" is the $64,000 question.

If you did have options, you could label the options you list as:
a) compatible with 1997 browsers and later
b) compatible with 1997 browsers and later
c) modify contents of document to be readable in all browsers.
Warning: some contents may appear different from your original document

Now, if your competitor offered this option:
d) Compatible with all browsers used _in your company_
you would have a hard time competing. (Note the emphasis on "in your
company" in the fourth option, meaning the customer's company. You could
even go on to say "most browsers on the Internet", but that got me in
trouble last time :-))

Erik raised an option of writing the actual byte value of the characters
in the file. It was my understanding that this can cause trouble on some
Unix servers that are not expecting byte values in the 0x80-0x9F range.
Can someone comment here?
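
To make the 0x80-0x9F concern concrete, here is a minimal Python sketch. It
does not settle the server question; it only shows why those bytes are
ambiguous. The sample string and variable names are made up for illustration.
CP1252 assigns printable punctuation to that range, while ISO 8859-1 reserves
the same byte values for C1 control codes.

```python
import unicodedata

# Sample text with curly double quotes (illustrative only).
smart = "\u201cquoted\u201d"

# The bytes a raw CP1252 export would write to the file.
raw = smart.encode("cp1252")

# Collect the bytes that land in the disputed 0x80-0x9F range.
high = [hex(b) for b in raw if 0x80 <= b <= 0x9F]

# A reader that assumes ISO 8859-1 maps the same bytes to control codes.
misread = raw.decode("latin-1")
categories = [unicodedata.category(ch) for ch in misread if ord(ch) >= 0x80]

print(high)        # the two quote bytes
print(categories)  # Unicode general category of each misread character
```

Under these assumptions both quote bytes fall inside the range, and the
ISO 8859-1 reading classifies them as control characters ("Cc") rather than
punctuation.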

Chris

        -----Original Message-----
        From: Unicode Discussion [SMTP:unicode@unicode.org]
        Sent: Monday, July 07, 1997 6:47 PM
        To: Multiple Recipients of
        Subject: Re: Usage of CP1252 characters on www.msnbc.com

        Chris Pratley wrote on 1997-07-08 00:29 UTC:
        > Do you (or anyone else), have some suggestions on this issue? I think it
        > is a hard problem to solve, and I was trying to get a sense of what
        > solutions people were adopting.

        In the Unix world, in such situations we make things configurable. I am
        not familiar with the various Microsoft products that produce HTML files,
        but I would expect quality software to allow me to switch between the
        following alternatives when I convert a CP1252 based file into HTML
        in some export filter:

          - convert to Unicode NCR
          - convert to Unicode UTF-8
          - transcribe down to ISO 8859-1 (i.e. replace `smartquotes' by quotes)

        and if it is really necessary for the existing installation base, then
        I might also offer the following together with a warning in big red
        blinking letters that it will break non-Windows systems:

          - output directly in CP1252 bytes (not NCR!) and make sure that the
            IANA registry contains a reasonable MIME entry for CP1252 and that
            the HTTP server will announce CP1252 as the encoding

        I fully understand that Microsoft is not alone guilty and that
        Netscape created the same mess even before. [But making new errors
        is always slightly more honorable than repeating old ones ... ;-]

        However, as you do, I also hope that Unicode support with at least the
        CP1252 characters (better even MES or more) will in the next 12 months
        become so widely implemented that backwards compatibility of the last
        option will not any more be that important and that then the first two
        options above become the widely accepted default choices.

        BTW: I just got a reply from MSNBC on my letter:

          Thank you for writing. The lack or change of punctuation you describe
          in viewing our site with Netscape is due to the way our web editor sees
          HTML code. Without getting technical, we have had to substitute
          standard HTML code that represents the apostrophes and other punctuation
          marks with a slightly different version of the code. This is definitely
          a bug in our web editor, and we are working hard on a permanent fix.
          Your patience in this matter is appreciated.

          MSNBC Customer Support

        Markus

        --
        Markus G. Kuhn, Computer Science grad student, Purdue
        University, Indiana, USA -- email: kuhn@cs.purdue.edu


