From: Nelson H. F. Beebe
Date: Sat May 10 2003 - 08:30:21 EDT

    Ben Dougall asks on 10 May 2003 10:56 about the problem of
    recognizing the encoding of untagged plain text.

    This note is to point out that considerable work has already been done
    on this in the MULE (MULtilingual Enhancement to GNU Emacs) extensions
    to the GNU Emacs text editor; they are available in the separate
    leim-x.y.z distribution at


    since 17-Sep-1997 for emacs-20.1 and later.  Simply put, if you run
    the following in an otherwise-empty directory,

            tar xfz emacs-x.y.z.tar.gz
            tar xfz leim-x.y.z.tar.gz
            cd emacs-x.y.z
            ./configure && make all check install

    you'll get an emacs with MULE support.

    Visiting a text file causes emacs to apply heuristics to the visited
    text to guess an encoding.  Should it guess incorrectly, the human
    user can then change the encoding with a few keystrokes or menu
    selections.
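
    To give a flavor of what such guessing involves, here is a minimal
    Python sketch.  It is NOT the algorithm emacs uses, which is far more
    elaborate; it only illustrates the general idea of checking for a
    byte-order mark and then trying candidate encodings in priority
    order.  The function name guess_encoding and the candidate list are
    assumptions made up for this illustration.

```python
import codecs

# Byte-order marks, checked first.  The UTF-32 marks must precede the
# UTF-16 ones because the UTF-32 LE mark begins with the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Hypothetical priority list: strictest encodings first.  latin-1 accepts
# every byte sequence, so it serves as the last resort.
CANDIDATES = ["ascii", "utf-8", "latin-1"]


def guess_encoding(data: bytes, candidates=CANDIDATES) -> str:
    """Return the name of the first plausible encoding for data."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding; try the next one
        return name
    # Only reached when a caller supplies candidates that all fail.
    return "latin-1"
```

    A guess of this kind can be wrong, since many byte sequences are
    valid in several encodings at once; that is why the human override
    matters.  In this sketch the equivalent of the override is simply to
    ignore the guess and decode with a name the user supplies.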

    All of the source code is available for study.

    GNU emacs builds on virtually any UNIX platform, except Mac OS X:
    Apple distributes a version without X11 support, but seems not to have
    returned their changes to the emacs developers.  There is also a
    version for several flavors of MS/Windows, running both natively and
    under Unix-like environments that run on top of that system: see


    for details.

