Re: suggestions for strategy on dealing with plain text ...

From: Nelson H. F. Beebe (beebe@math.utah.edu)
Date: Sat May 10 2003 - 08:30:21 EDT

    Ben Dougall <bend@freenet.co.uk> asks on 10 May 2003 10:56 about the
    problem of recognizing the encoding of untagged plain text.

    Considerable work on this problem has already been done in the MULE
    (MULtilingual Enhancement to GNU Emacs) extensions to the GNU Emacs
    text editor; they are available in the separate leim-x.y.z
    distribution at

            ftp://ftp.gnu.org/gnu/emacs/

    since 17-Sep-1997, for emacs-20.1 and later. In short, if, in an
    otherwise-empty directory, you run

            tar xfz emacs-x.y.z.tar.gz
            tar xfz leim-x.y.z.tar.gz
            cd emacs-x.y.z
            ./configure && make all check install

    you'll get an emacs with MULE support.

    When you visit a text file, emacs applies heuristics to its contents
    to guess the encoding. Should it guess incorrectly, the user can
    then change the encoding with a few keystrokes or menu selections.
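
    To make the idea of encoding heuristics concrete, here is a minimal
    sketch in Python. It is NOT the algorithm that emacs uses; it only
    illustrates one common approach: look for a byte-order mark, then try
    strict decoding against a priority list of candidate encodings, and
    fall back to a single-byte encoding that accepts any input. The
    function name guess_encoding, the candidate list, and the latin-1
    fallback are all assumptions made for illustration.

```python
import codecs

# Byte-order marks, checked before anything else. The UTF-32 marks come
# before the UTF-16 ones because the UTF-32 LE mark begins with the
# UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Hypothetical priority list: strictest encodings first. A real tool
# would let the user reorder this, much as an editor lets the user
# override a wrong guess.
DEFAULT_CANDIDATES = ("ascii", "utf-8")


def guess_encoding(data, candidates=DEFAULT_CANDIDATES, fallback="latin-1"):
    """Return the name of a plausible encoding for the bytes in data.

    This is only a guess: several encodings can decode the same bytes
    without error, so the first candidate that succeeds wins.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)  # strict mode: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return name
    # latin-1 maps every byte to a character, so it never fails to
    # decode; that makes it a last resort, not a confident answer.
    return fallback
```

    Calling guess_encoding(open(path, "rb").read()) on a file returns the
    first candidate that decodes cleanly. Because the order of the
    candidates decides ambiguous cases, a wrong guess is always possible,
    which is why a manual override remains necessary.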

    All of the source code is available for study.

    GNU emacs builds on virtually any UNIX platform except Mac OS X:
    Apple distributes a version without X11 support, but seems not to have
    returned its changes to the emacs developers. There are also
    versions for several flavors of MS Windows, both native and under
    Unix-like environments that run on top of that system: see

            http://www.math.utah.edu/~beebe/gnu-on-windows.html

    for details.

    -------------------------------------------------------------------------------
    - Nelson H. F. Beebe Tel: +1 801 581 5254 -
    - Center for Scientific Computing FAX: +1 801 581 4148 -
    - University of Utah Internet e-mail: beebe@math.utah.edu -
    - Department of Mathematics, 110 LCB beebe@acm.org beebe@computer.org -
    - 155 S 1400 E RM 233 beebe@ieee.org -
    - Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe -
    -------------------------------------------------------------------------------


