Re: Frequent incorrect guesses by the charset autodetection in IE7

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Sat Aug 19 2006 - 03:43:39 CDT

Next message: Otto Stolz: "Re: Frequent incorrect guesses by the charset autodetection in IE7"

Previous message: Dean Harding: "RE: Frequent incorrect guesses by the charset autodetection in IE7"
In reply to: Sinnathurai Srivas: "Re: Frequent incorrect guesses by the charset autodetection in IE7"
Next in thread: Otto Stolz: "Re: Frequent incorrect guesses by the charset autodetection in IE7"
Reply: Otto Stolz: "Re: Frequent incorrect guesses by the charset autodetection in IE7"

Hello Sinnathurai Srivas,

you have written:
> please see the image. I'll appreciate if any one
> willing to come out and help me document these problems.
> http://www.araichchi.net/kanini/unicode/fail/u-photoplus-fails.jpg
> http://www.araichchi.net/kanini/unicode/fail/unicode_status.htm
>
> Look for question mark.
> Look for where the question mark starts, despite the use of Unicode fonts.

In order to get help with your problems, you should explain what you
have done and what you expected. At the moment, we see only an
image that displays fine in any browser. At the very least, we would
need a link to the WWW page that you displayed to
produce that screenshot.

However, I have inspected both your HTML code in
<http://www.araichchi.net/kanini/unicode/fail/unicode_status.htm>
and the pertinent HTTP headers from your server, and I have found
no charset declaration. If you present your UTF-encoded pages
in the same way, they will of course not display correctly.

To display a page correctly, the browser must, of course, know
its encoding; and it is the author's and the server's duty to
inform the browser about it -- be it ISCII, ISO 8859, UTF, or
whatever. If there is no explicit declaration of the encoding,
the browser must assume ASCII (ISO 646 IRV) for HTML 3, and
ISO 8859-1 for HTML 4 (and above). This is what the Internet
standards say.
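
To illustrate why the declaration matters, here is a minimal Python sketch
of what happens when UTF-8 bytes are interpreted as ISO 8859-1. The sample
word is an assumption chosen for illustration; it is not taken from the
page under discussion.

```python
# Illustrative only: the sample text is hypothetical, not from the page.
text = "தமிழ்"  # the word "Tamil" in Tamil script (5 code points)

# This is what is stored on disk and sent over the wire for a UTF-8 page.
utf8_bytes = text.encode("utf-8")

# A browser that falls back to ISO 8859-1 maps every single byte to one
# Latin-1 character, so the multi-byte sequences fall apart.
misread = utf8_bytes.decode("iso-8859-1")

# With the correct declaration, the same bytes decode to the original text.
correct = utf8_bytes.decode("utf-8")

print(len(text), len(utf8_bytes), len(misread))
print(correct == text, misread == text)
```

The bytes are identical in both cases; only the assumed encoding differs.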

To use Unicode, e. g. the UTF-8 encoding, in a WWW page, you will
have to:
- store your HTML source in UTF-8 encoding,
- insert the following HTML header:
   <meta http-equiv="content-type" content="text/html; charset=utf-8">
- configure your server to emit the following HTTP header:
   Content-Type: text/html; charset=utf-8
And your audience must, of course,
- have suitable fonts installed, capable of rendering all the
   characters you are using in your page,
- let their browsers just follow the advice from the HTTP headers,
   and not try a different encoding.

The HTML meta tag will inform the browser in case the user stores
that page locally and displays it later (without the aid of your
server). The other four requirements are essential for the normal
situation, when a browser displays a remote page.
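
The author-side and server-side steps can be sketched in Python roughly as
follows. This is only an illustration under assumptions: the function name
build_response and the page skeleton are hypothetical, and a real server
would be configured through its own settings rather than through code like
this. The media type is written as text/html so that it matches the meta
element.

```python
from __future__ import annotations


def build_response(body_text: str) -> tuple[dict[str, str], bytes]:
    """Return (HTTP headers, payload bytes) for a small UTF-8 HTML page.

    Hypothetical helper for illustration; not part of any real server API.
    """
    html = (
        "<html><head>\n"
        # Declaration inside the document, useful when the page is saved
        # locally and later opened without the server's HTTP headers.
        '<meta http-equiv="content-type" content="text/html; charset=utf-8">\n'
        "</head><body>" + body_text + "</body></html>\n"
    )
    # Store/transmit the source in the encoding that is being declared.
    payload = html.encode("utf-8")
    headers = {
        # Declaration in the HTTP header, used for the normal remote case.
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(payload)),
    }
    return headers, payload


headers, payload = build_response("தமிழ்")
print(headers["Content-Type"])
```

The point of the sketch is that the stored bytes, the meta element, and the
HTTP header all name the same encoding.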

In contrast, your
<http://www.araichchi.net/kanini/unicode/fail/unicode_status.htm>
has no HTML doctype declaration, so the browser assumes HTML 4.01
transitional, cf. <http://validator.w3.org/docs/help.html#faq-doctype>.
The HTTP headers for that page specify
> Content-Type: text/html
So the browser must assume "charset=iso-8859-1". If your test
page that displays as u-photoplus-fails.jpg was served in
the same way, no browser has ever regarded it as Unicode-encoded.
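
The rule applied above can be sketched in Python as follows. This is a
simplified model, not how any particular browser is implemented: the
function name declared_charset is hypothetical, and the fallback value
follows the reasoning in this message. The header parsing uses the standard
library's email.message.Message, which understands Content-Type parameters.

```python
from email.message import Message


def declared_charset(content_type: str, default: str = "iso-8859-1") -> str:
    """Return the charset named in a Content-Type value, else the default.

    Hypothetical helper; the default mirrors the assumption described in
    the surrounding text for a header without a charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased, or None if absent
    return charset if charset else default


print(declared_charset("text/html"))
print(declared_charset("text/html; charset=UTF-8"))
```

With the header quoted above, the function falls back to the default; once
the server adds the charset parameter, the declared value is returned.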

I am glad that the discussion now turns away from scolding and finds
its way to solid technical grounds. I am quite confident that most,
or all, of your complaints can be solved if you just take the pains
- to study, and comply with, the necessary standards and rules,
- and, if you hit on an insurmountable problem, to describe it in
detail.
A good starting point for your studies is <http://www.unicode.org/faq/>.

Best wishes,
Otto Stolz

This archive was generated by hypermail 2.1.5 : Wed Jul 19 2006 - 03:52:32 CDT