Re: suggestions for strategy on dealing with plain text ...

From: Nelson H. F. Beebe
Date: Sat May 10 2003 - 08:30:21 EDT

    Ben Dougall asks on 10 May 2003 10:56 about the
    problem of recognizing the encoding of untagged plain text.

    This note is to point out that considerable work has already been done
    on this in the MULE (multi-lingual emacs) extensions to the GNU Emacs
    text editor; they are available in the separate leim-x.y.z
    distribution at


    since 17-Sep-1997 for emacs-20.1 and later. Simply put, if in an
    otherwise-empty directory, you do

            tar xfz emacs-x.y.z.tar.gz
            tar xfz leim-x.y.z.tar.gz
            cd emacs-x.y.z
            ./configure && make all check install

    you'll get an emacs with MULE support.

    Visiting a text file causes emacs to apply heuristics to the visited
    text to guess an encoding. Should it guess incorrectly, the human
    user can then change the encoding with a few keystrokes or menu
    selections.

    All of the source code is available for study.
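
    For readers who want a feel for the general idea before reading
    the emacs sources, here is a minimal Python sketch of a
    guess-then-override strategy. It is NOT the MULE algorithm, which I
    have not reproduced here; the function names, the candidate list,
    and its ordering are all assumptions made purely for illustration.
    It checks for a byte-order mark, then tries a few encodings from
    strictest to most permissive, and lets the caller override the guess.

```python
import codecs

# Hypothetical candidate list (an assumption, not what emacs uses),
# ordered from strictest to most permissive. latin-1 accepts any byte
# sequence, so it acts as a last-resort fallback.
CANDIDATES = ("ascii", "utf-8", "latin-1")

# Byte-order marks. UTF-32 must be tested before UTF-16, because the
# UTF-32-LE mark begins with the same two bytes as the UTF-16-LE mark.
BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def guess_encoding(data, candidates=CANDIDATES):
    """Return the first plausible encoding name, or None if none fits."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding; try the next one
        return name
    return None


def decode_text(data, override=None):
    """Decode with the guessed encoding unless the caller overrides it."""
    encoding = override or guess_encoding(data)
    if encoding is None:
        raise ValueError("no candidate encoding fits the data")
    return data.decode(encoding)
```

    A successful decode only shows that the bytes are valid in that
    encoding, not that the guess is right, which is why the override
    argument matters: decode_text(raw, override="cp1252") plays the
    role of the human correcting a wrong guess.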

    GNU emacs builds on virtually any UNIX platform, except Mac OS X:
    Apple distributes a version without X11 support, but seems not to have
    returned their changes to the emacs developers. There is also a
    version for several flavors of MS/Windows, both native, and under
    Unix-like environments that run on top of that system: see


    for details.

    - Nelson H. F. Beebe                    Tel: +1 801 581 5254 -
    - Center for Scientific Computing       FAX: +1 801 581 4148 -
    - University of Utah                                         -
    - Department of Mathematics, 110 LCB                         -
    - 155 S 1400 E RM 233                                        -
    - Salt Lake City, UT 84112-0090, USA                         -
