From: Philippe Verdy (email@example.com)
Date: Thu May 29 2003 - 06:11:55 EDT
From: "Tom Gewecke" <firstname.lastname@example.org>
> I wonder about this. The Unicode FAQ makes the point that some browsers
> will not display NCR's unless the charset is UTF-8. It does seem logical
> that, NCR's or not, a page with the logo should be in one of the three
> standard Unicode Encoding Forms, UTF-8, 16, or 32.
The Unicode FAQ could also have said that the reverse is true: there are still browsers (even more of them) that do not display UTF-8 correctly, but do accept Numeric Character References and correctly interpret them as designating Unicode code points.
I have received more reports of this, notably from Chinese, Korean, and Japanese users, who still very often use a browser that supports some form of their national encoding (SJIS, GB2312, Big5, KSC5601), sometimes with ISO2022-*, but that shamefully does not decode UTF-8 properly, even when the page is correctly labelled, because the browser does not automatically switch encodings when the page is loaded. This happens even when the encoding is specified with an HTTP Content-Type header or with an HTML header element.
So for now, it is simply easier to use UTF-8 when designing the pages, and then save them as ISO-8859-1 (using NCRs). I admit this is troublesome. These same browsers really do know how to handle Unicode code points, and even know UTF-8, but they refuse to switch to it because they do not interpret the meta-information that both the page content and the HTTP header specify! I have found that these browsers simply do not recognize ANY encoding markup or metadata and always use the user's setting (which is stupid in this case; it would only make sense if the page were incorrectly labelled).
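For illustration, here is a minimal Python sketch of the "author in UTF-8, save as ISO-8859-1 with NCRs" step described above. It is not the tool actually used for these pages; the function name to_latin1_with_ncrs and the sample string are hypothetical, and it relies only on the standard "xmlcharrefreplace" error handler of str.encode, which replaces every character that ISO-8859-1 cannot represent with a decimal NCR.

```python
import html


def to_latin1_with_ncrs(text: str) -> bytes:
    """Encode text as ISO-8859-1, writing unencodable characters as decimal NCRs.

    Hypothetical helper name; the behaviour comes from Python's built-in
    "xmlcharrefreplace" error handler.
    """
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Sample content as it might be authored in UTF-8 (assumed example text).
sample = "café 日本語 €"
encoded = to_latin1_with_ncrs(sample)

# "é" exists in ISO-8859-1 and stays a single byte (0xE9); the CJK
# characters and the euro sign become NCRs such as &#26085;.
print(encoded)

# Round trip: a browser that reads the bytes as ISO-8859-1 and resolves
# the NCRs as Unicode code points recovers the original text.
decoded = html.unescape(encoded.decode("iso-8859-1"))
print(decoded == sample)
```

Note that this sketch only converts text content. A real page would also need its charset declaration changed to ISO-8859-1, and characters that are already meaningful in HTML (such as "&" and "<") are left untouched here.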
This archive was generated by hypermail 2.1.5 : Thu May 29 2003 - 07:01:52 EDT