Re: encoding sniffing

From: Patrick Andries
Date: Mon Jul 14 2003 - 17:42:06 EDT


    ----- Original Message -----
    From: "Philippe Verdy" <>

    > On Monday, July 14, 2003 10:14 PM,
    <> wrote:
    > > Are there any libraries out there (open-source or otherwise) that can
    > > be used to detect the character encoding of a file or data stream?
    > Yes, but these libraries actually try to detect the encoded
    > language: first strict validity rules to narrow down the
    > possible encodings, then statistical rules to match the
    > languages against their various encoded byte sequences, then
    > the help of common words.
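
As a rough illustration of the first and last steps described in the quote above (strict validity, then common words), here is a minimal Python sketch. It is not how any particular library works. The candidate list, the word list, and the function names `valid_encodings` and `pick_encoding` are all assumptions made up for this example, and the statistical middle step is left out entirely.

```python
# Hypothetical candidate list; a real detector would consider many more encodings.
CANDIDATES = ["ascii", "utf-8", "cp1252", "iso-8859-1"]

# Hypothetical, deliberately tiny list of common French words (illustration only).
COMMON_FRENCH = {"le", "la", "et", "en", "français"}


def valid_encodings(data: bytes, candidates=CANDIDATES):
    """Validity step: keep only the encodings under which the bytes decode strictly."""
    survivors = []
    for name in candidates:
        try:
            data.decode(name, errors="strict")
        except UnicodeDecodeError:
            continue  # the byte sequence is illegal in this encoding
        survivors.append(name)
    return survivors


def pick_encoding(data: bytes):
    """Common-word step: prefer the survivor whose decoded text contains the most
    known words. Ties go to the earliest candidate in the list."""
    survivors = valid_encodings(data)
    if not survivors:
        return None

    def score(name):
        words = set(data.decode(name).lower().split())
        return len(words & COMMON_FRENCH)

    return max(survivors, key=score)
```

With the bytes of "Textes Unicode en français" encoded as UTF-8, the validity step rules out only ASCII, and the word match then favours UTF-8 over the single-byte candidates. With the same text encoded as ISO-8859-1, the validity step rules out ASCII and UTF-8 but cannot separate cp1252 from ISO-8859-1, since both decode the bytes to the same text. This is the kind of limitation such heuristics run into.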

    I know one such library ( and
    it does not use a three-step approach as you outline it above, but a single

    In any case, I believe Peter already has an idea of how these libraries
    work and of their limitations; he is simply looking for one, limitations
    and all.

    P. Andries
    - o - 0 - o -
    Textes Unicode en français

    This archive was generated by hypermail 2.1.5 : Mon Jul 14 2003 - 18:22:29 EDT