Re: UTF-8 display (was: Re: a mug)

From: philip chastney <>
Date: Tue, 21 Jul 2015 05:49:52 -0700

so the webmaster put up the page, declaring the charset to be UTF-8...

but what charset was being used by the guy who knocked out the HTML?

it could be more complicated than that: maybe the page was produced using UTF-8,
then somebody reads the page using, say, Windows-1252, and "converts" it to UTF-8

I'm sure, with a little effort, ever more complicated scenarios could be constructed
-- it's amazing what can be achieved when arrogance and ignorance are combined
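
The double-conversion scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample string is the "Our apologies…" text from the quoted message, the variable names are made up for the example, and nothing here claims to reproduce what the actual page or tool did.

```python
# Sketch of the "produced as UTF-8, read as Windows-1252, converted to UTF-8"
# scenario. The sample text and all variable names are illustrative assumptions.

original = "Our apologies\u2026"          # ends with U+2026 HORIZONTAL ELLIPSIS

# The page is produced as UTF-8: U+2026 becomes the three bytes E2 80 A6.
utf8_bytes = original.encode("utf-8")

# Somebody reads those bytes as Windows-1252: each byte becomes one character.
misread = utf8_bytes.decode("cp1252")

# ...and then "converts" the misread text to UTF-8, baking the damage in.
double_encoded = misread.encode("utf-8")

# As long as no byte was lost, the damage can be undone by reversing the steps.
repaired = double_encoded.decode("utf-8").encode("cp1252").decode("utf-8")

print(repr(misread))
print(repaired == original)
```

The repair step only works when every misread byte has a Windows-1252 mapping; Python's cp1252 codec rejects the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D), so text containing those would need different handling.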


On Tue, 21/7/15, Marcel Schneider <> wrote:

 Subject: UTF-8 display (was: Re: a mug)
 To: "UmeshPN" <>, "DanielBünzli" <>
 Cc: "UnicodeMailingList" <>
 Date: Tuesday, 21 July, 2015, 8:46 AM
 On 13 Jul 2015, at 11:28, I wrote:
 > The only time I saw UTF-8 like on the T-shirt was when opening UTF-8 files
 > that didn't specify charset=UTF-8. The thing to do was to add the charset
 > in the file header.
 Now I see that this issue is much trickier. I've just stumbled over a
 no-display page instead of (or at the URL of)
 where I read:
 Our apologies…
 while the source as displayed by Firefox shows:
 Our apologies
 (The markup comes from the header 1 tags.)
 The trick is that the real HTML file as saved by Zotero contains:
 Our apologies…
 (with a U+2026)
 and is encoded in...
 Once I changed this to utf-8, the page displayed correctly:
 Our apologies…
 This may be why people are puzzled with UTF-8 up to the end we've seen.
 So I would like to present my apologies to the List, and ask if anyone would
 help us to know the real problem (browsers, web editors, or else) and how to
 fix it. I don't think it's a mere HTML issue, as it concerns the Unicode
 Transformation Format.
 Best regards,
Received on Tue Jul 21 2015 - 07:51:09 CDT
