    I'm looking for tips on automatically detecting whether text data is
    encoded in MS-DOS CP437 (or 850, etc.) versus Latin-1 or Windows
    CP1252. The solution doesn't have to be perfect, just pretty good.

    One problem is detecting text with the MS-DOS box-drawing characters,
    many of which occupy the same code points (roughly 0xC0-0xDA) as
    Latin-1 accented letters. This means that simple range-checking often
    doesn't work.
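
    To make the problem concrete, here is a rough Python sketch of the
    kind of context-based check that might do better than a plain range
    test. Everything in it is an assumption for illustration, not a tested
    solution: the name guess_codepage, the byte ranges chosen, and the
    scoring rule. It treats a high byte next to another byte in the CP437
    box/block range as evidence for CP437, and a byte at 0xC0 or above
    next to an ASCII letter as evidence for Latin-1/CP1252.

```python
# Hypothetical heuristic: the names, ranges and scoring are illustrative assumptions.
BOX = range(0xB0, 0xE0)          # CP437 shading, box-drawing and block characters
DOS_LETTERS = range(0x80, 0xA6)  # CP437 accented letters
AMBIGUOUS = range(0x91, 0x98)    # CP1252 curly quotes, bullet, dashes: skipped


def _is_ascii_letter(b: int) -> bool:
    return 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A


def guess_codepage(data: bytes) -> str:
    """Return 'ascii', 'cp437', 'cp1252' or 'unknown' as a best guess."""
    if all(b < 0x80 for b in data):
        return "ascii"
    dos = win = 0
    for i, b in enumerate(data):
        if b < 0x80:
            continue
        # Treat the start and end of the data as if padded with spaces.
        prev = data[i - 1] if i > 0 else 0x20
        nxt = data[i + 1] if i + 1 < len(data) else 0x20
        in_word = _is_ascii_letter(prev) or _is_ascii_letter(nxt)
        if b in BOX and (prev in BOX or nxt in BOX):
            dos += 1   # run of box/block bytes: looks like a DOS frame
        elif b >= 0xC0 and in_word:
            win += 1   # high byte inside a word: looks like a Latin-1 letter
        elif b in DOS_LETTERS and b not in AMBIGUOUS and in_word:
            dos += 1   # CP437 accented letter inside a word
    if dos > win:
        return "cp437"
    if win > dos:
        return "cp1252"
    return "unknown"
```

    For example, guess_codepage("café".encode("cp1252")) and
    guess_codepage("café".encode("cp437")) come out differently because
    the é lands in different byte ranges. A sketch like this will still
    be fooled by things like doubled accented capitals, so it is only a
    starting point.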

    Please send replies off-list unless you feel they would interest the
    list. Please don't tell me this is anachronistic; I know it is. I'm
    trying to migrate a lot of that anachronistic data to UTF-8, as
    automatically as possible.

