Re: Character set conversion question

From: Leo Broukhis (leob@mailcom.com)
Date: Wed Jun 17 2009 - 00:02:50 CDT


    That's exactly my question: how do I organize the brute force approach
    with the 217 character maps in /usr/share/i18n/charmaps/?

    Leo

    On Tue, Jun 16, 2009 at 3:38 PM, Bjoern Hoehrmann<derhoermi@gmx.net> wrote:
    > * Leo Broukhis wrote:
    >>What would be a way to find out what character set conversions were
    >>applied to the text?
    >
    > Where the brute force approach fails and you have not misanalyzed the
    > byte stream (copy and paste from a mail program may be unreliable) it
    > is likely that you either have not tried enough encodings, or the
    > encoding is the result of function composition, for example, it might
    > have been ISO-8859-X which is then interpreted as ISO-8859-Y and then
    > encoded using ISO-8859-Z by some process; a popular example is UTF-8
    > encoded data re-interpreted as ISO-8859-1 and re-encoded as UTF-8.
    > Then your brute force search has to include such compositions as well.
    > --
    > Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
    > Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
    > 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
    >
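
One possible way to organize such a search is sketched below in Python. This is only an illustration under stated assumptions: the list CANDIDATES is a small hand-picked set of Python codec names, not the 217 glibc charmaps, and the function name undo_one_step and the idea of checking for a known expected fragment are hypothetical choices made for the example. It tries every ordered pair (wrong, right) and tests whether re-encoding the garbled text with "wrong" and decoding the bytes with "right" recovers the expected fragment, which covers the case of UTF-8 data that was re-interpreted as ISO-8859-1.

```python
from itertools import permutations

# Illustrative subset of Python codec names (an assumption for this sketch,
# not the 217 charmaps under /usr/share/i18n/charmaps/).
CANDIDATES = ["utf-8", "iso-8859-1", "iso-8859-2", "iso-8859-5",
              "cp1251", "cp1252", "koi8-r"]

def undo_one_step(garbled, expected, encodings=CANDIDATES):
    """Find (wrong, right) pairs that undo a single misinterpretation.

    A pair is a hit when the garbled text looks like bytes in the 'right'
    encoding that some process decoded using the 'wrong' encoding.
    """
    hits = []
    for wrong, right in permutations(encodings, 2):
        try:
            # Reverse the faulty decode, then decode the bytes properly.
            repaired = garbled.encode(wrong).decode(right)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this pair cannot have produced the garbled text
        if expected in repaired:
            hits.append((wrong, right, repaired))
    return hits

# Simulate the case from the quoted message: UTF-8 bytes read as ISO-8859-1.
original = "Björn Höhrmann"
garbled = original.encode("utf-8").decode("iso-8859-1")

for wrong, right, repaired in undo_one_step(garbled, "Höhrmann"):
    print(f"misread as {wrong}, really {right}: {repaired}")
```

More than one pair can match, because some encodings agree on the bytes involved, so the hits are candidates rather than a definitive answer. Longer chains of conversions would need the same loop applied to every intermediate result, and the search grows quickly with the number of encodings tried.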


