2012-07-26 0:19, Steven Atreju wrote:
>    |
>
> And that was an Unicode BOM that has been converted to UTF-8 and
> then been converted to UTF-8 once again.
Apparently the problem is that the data has been doubly encoded: first
into UTF-8, then the bytes of that UTF-8 data were interpreted as if
they were windows-1252, and the resulting characters were UTF-8 encoded
once more. This is of course quite wrong, though not uncommon.
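
To make the mechanism concrete, here is a minimal Python sketch of the double encoding described above. The function name `double_encode` is made up for illustration, and Python's `cp1252` codec is assumed as a stand-in for windows-1252 (it rejects the five byte values that windows-1252 leaves undefined, which do not occur in these examples).

```python
def double_encode(text: str) -> bytes:
    """Reproduce the faulty pipeline: UTF-8, misread as windows-1252, UTF-8 again."""
    utf8_bytes = text.encode("utf-8")        # first, correct encoding step
    misread = utf8_bytes.decode("cp1252")    # bytes wrongly taken as windows-1252
    return misread.encode("utf-8")           # second, erroneous encoding step


# What a UTF-8 aware reader sees after the double encoding:
bom_garbled = double_encode("\ufeff").decode("utf-8")   # the BOM, U+FEFF
u_garbled = double_encode("\u00fc").decode("utf-8")     # the letter ü

print(repr(bom_garbled))  # three characters: U+00EF, U+00BB, U+00BF
print(repr(u_garbled))    # two characters: U+00C3, U+00BC
```

The BOM's three UTF-8 bytes (EF BB BF) each turn into a separate Latin-1 range character, and the second encoding step expands them to six bytes.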
>    |vielen Dank für Ihre E-Mail.
So the letter “ü” was munged too, and presumably all non-ASCII data. So
this is not an argument against using BOM in UTF-8. The BOM was a victim
of incorrect processing, like every other character outside ASCII. One
might even argue that the BOM is useful here, too, since it immediately
signals that there is something wrong, and “ï»¿” is an encoding error
signature, so to speak.
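
As a rough sketch of how that signature could be put to use, the Python below checks for the garbled BOM at the start of a text and, if it is found, reverses the erroneous step. The names `GARBLED_BOM`, `looks_double_encoded` and `repair` are hypothetical, the sample string is assembled from the quoted fragments for illustration, and the repair assumes the whole text went through exactly one windows-1252 misreading.

```python
# The BOM (U+FEFF) after UTF-8 bytes were misread as windows-1252.
GARBLED_BOM = "\u00ef\u00bb\u00bf"


def looks_double_encoded(text: str) -> bool:
    """True if the text starts with the garbled-BOM signature."""
    return text.startswith(GARBLED_BOM)


def repair(text: str) -> str:
    """Undo one windows-1252 misreading of UTF-8 data.

    Raises UnicodeError if the text does not fit that pattern.
    """
    return text.encode("cp1252").decode("utf-8")


# Hypothetical sample: garbled BOM followed by a garbled "für".
sample = "\u00ef\u00bb\u00bfvielen Dank f\u00c3\u00bcr Ihre E-Mail."

if looks_double_encoded(sample):
    fixed = repair(sample).lstrip("\ufeff")  # drop the recovered BOM
else:
    fixed = sample

print(fixed)
```

Without the BOM, the same check would have to look for sequences such as U+00C3 followed by another Latin-1 range character, which is less reliable since those can occur in legitimate text.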
Yucca
Received on Wed Jul 25 2012 - 16:48:11 CDT