Re: 32'nd bit & UTF-8

From: Marcin 'Qrczak' Kowalczyk
Date: Wed Jan 19 2005 - 14:09:41 CST

    "Oliver Christ" writes:

    > On the very contrary. It's most helpful to determine a text file's
    > encoding. Without the UTF8 BOM it's hard to tell whether a file is
    > encoded in some ISO or whatever encoding/codepage or is already UTF8.

    The problem with a BOM in UTF-8 is that every application must handle
    it specially. It effectively turns UTF-8 into a stateful encoding in
    which the beginning of a "text stream" must be treated differently
    from the rest. The world would be simpler if the UTF-8 BOM were banned.
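
To illustrate the special handling described above, here is a minimal
Python sketch. The helper name strip_utf8_bom and the sample byte strings
are made up for illustration; the only library facts relied on are that
codecs.BOM_UTF8 is the byte sequence EF BB BF and that Python's plain
"utf-8" codec keeps a leading BOM as U+FEFF instead of dropping it.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present.

    Hypothetical helper: only the very start of the stream is checked,
    which is exactly the "beginning is special" state a BOM introduces.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Two made-up "files", each starting with a BOM.
file_a = codecs.BOM_UTF8 + "first\n".encode("utf-8")
file_b = codecs.BOM_UTF8 + "second\n".encode("utf-8")

# Naive concatenation (what `cat a b` does) leaves one BOM at the start
# and another in the middle of the text.
naive = (file_a + file_b).decode("utf-8")

# Every tool that joins or reads files has to strip the BOM per file.
clean = b"".join(strip_utf8_bom(f) for f in (file_a, file_b)).decode("utf-8")
```

Here naive carries two U+FEFF characters while clean carries none; the
point is only that the stripping has to be repeated in every program that
touches the data.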

    Fortunately I have never met a Unix program which used a UTF-8 BOM,
    so I can mostly ignore the issue, except that text files coming from
    Windows may have that annoying thing at the beginning which must be

       __("<         Marcin Kowalczyk
