Re: € sign encoding mix??

From: Markus Scherer
Date: Fri Mar 17 2006

    Just a guess: U+20AC (Euro) got converted to windows-1252 0x80 (Euro),
    mis-interpreted as U+0080 (C1 control), and re-encoded in UTF-8.

    windows-1252 is a sort of superset of ISO 8859-1 in that it replaces
    most C1 control codes with various useful graphic characters like the
    Euro sign and curly quotes.
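
    The chain guessed at above can be reproduced in a few lines of Python.
    This is only a sketch of the suspected failure, not a diagnosis of the
    actual systems involved: the function names are made up for illustration,
    and it assumes the misreading step behaves like Python's "latin-1" codec,
    which maps byte 0x80 straight to U+0080.

```python
def simulate_double_encoding(text: str) -> bytes:
    """Hypothetical helper: replay the suspected conversion chain."""
    # Step 1: the text is converted to windows-1252 (Euro becomes byte 0x80).
    cp1252_bytes = text.encode("cp1252")
    # Step 2: the bytes are misread as ISO 8859-1, so 0x80 becomes U+0080.
    misread = cp1252_bytes.decode("latin-1")
    # Step 3: the misread text is re-encoded in UTF-8.
    return misread.encode("utf-8")


def repair(data: bytes) -> str:
    """Hypothetical helper: undo the chain, assuming it happened exactly once."""
    return data.decode("utf-8").encode("latin-1").decode("cp1252")


euro = "\u20ac"
print(euro.encode("utf-8").hex())            # correct UTF-8 for U+20AC: e282ac
print(simulate_double_encoding(euro).hex())  # the mangled form: c280
print(repair(b"\xc2\x80") == euro)           # True
```

    The mangled output matches the C2 80 reported in the original message,
    which is consistent with the guess. The repair step is only safe if the
    data really went through this chain; applied to correctly encoded text
    it would fail or produce wrong characters.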


    > In the Unicode-Charts a € (Euro-sign) has the hex20 AC.
    > If I convert this into UTF-8 I get a hexE2 82 AC.
    > Now the receiving System gets this, handles this as a UTF-8 encoding, but can't display the character.
    > What the receiving System can display is a hexC2 80, which is the UTF-8 encoded form of hex80.

