UTF-8 display (was: Re: a mug)

From: Marcel Schneider <charupdate_at_orange.fr>
Date: Tue, 21 Jul 2015 10:46:33 +0200 (CEST)

On 13 Jul 2015, at 11:28, I wrote:

> The only time I saw UTF-8 like on the T-shirt, was when opening UTF-8 files that didn't specify charset=UTF-8. The thing to do was to add the charset in the file header.

Now I see that this issue is much trickier. I've just stumbled upon an error page shown instead of (or at the URL of) http://www-01.ibm.com/software/globalization/topics/keyboards/physical.jsp where I read:
Our apologies…
while the source as displayed by Firefox shows:
charset=utf-8

Our apologies
(The markup comes from the header 1 tags.)

The trick is that the real HTML file as saved by Zotero contains:

Our apologies…
(with a U+2026)
and is encoded in...
charset=windows-1252

Once I changed this to utf-8, the page displayed correctly:
Our apologies…
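
To illustrate the mismatch described above, here is a minimal Python sketch, assuming the page bytes really are UTF-8 and are merely decoded as windows-1252 (Python's "cp1252" codec). The helper names misread_as_cp1252 and repair are hypothetical and serve only this illustration; I am not claiming this is what the browser or Zotero does internally.

```python
# Sketch of the UTF-8 / windows-1252 mismatch, standard library only.
# The helper names below are hypothetical, chosen for this illustration.

ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS


def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo the misreading: back to bytes via windows-1252, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


utf8_bytes = ELLIPSIS.encode("utf-8")
print(utf8_bytes.hex(" "))  # e2 80 a6

# Each of the three bytes becomes one windows-1252 character:
# 0xE2 is U+00E2, 0x80 is U+20AC (euro sign), 0xA6 is U+00A6 (broken bar).
garbled = misread_as_cp1252("Our apologies" + ELLIPSIS)
restored = repair(garbled)
```

Under that assumption, garbled holds "Our apologies" followed by the three characters U+00E2, U+20AC and U+00A6, and restored is the original string again. The round trip only works here because all three bytes happen to be defined in windows-1252.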

This may be why people are puzzled by UTF-8 to the extent we've seen.

So I would like to present my apologies to the List, and ask if anyone could help us identify the real problem (browsers, web editors, or something else) and how to fix it. I don't think it's merely an HTML issue, as it concerns the Unicode Transformation Format.

Best regards,

Marcel
Received on Tue Jul 21 2015 - 03:48:00 CDT
