RE: Several BOMs in the same file

From: Kent Karlsson (kentk@md.chalmers.se)
Date: Tue Mar 25 2003 - 06:03:36 EST


    > Let's say that I have two files, namely file1 & file2, in any Unicode
    > encoding, both starting with a BOM, and I compile them into
    > one by using
    >
    > cat file1 file2 > file3

    On POSIX implementations, this simply concatenates the octets (bytes)
    of the two files, whether each of them is text in UTF-8, text
    in BIG-5, directly executable code for whatever instruction set,
    a JPEG image, a PNG image, compressed or encrypted data, or any other
    kind of file, in whatever mixture. The result may be uninterpretable...
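    To illustrate the point about octets, here is a minimal Python sketch
    (not part of the original exchange). It assumes both inputs are UTF-8
    text that starts with a BOM; `cat_bytes` is a hypothetical stand-in
    for what 'cat' does, and the file contents are made up for the example.

```python
import codecs

def cat_bytes(*chunks: bytes) -> bytes:
    """Join octets the way 'cat' does: no interpretation of the content."""
    return b"".join(chunks)

# Two made-up "files", each UTF-8 text starting with a BOM (EF BB BF).
file1 = codecs.BOM_UTF8 + "first\n".encode("utf-8")
file2 = codecs.BOM_UTF8 + "second\n".encode("utf-8")

file3 = cat_bytes(file1, file2)

# The "utf-8-sig" codec drops a BOM only at the very start of the stream,
# so the BOM of file2 survives in the middle of the text as U+FEFF.
text = file3.decode("utf-8-sig")
print(repr(text))
```

    The second U+FEFF is still in the decoded text; whether that is
    harmless depends on what later consumes the result.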

    'cat' is NOT the program that should even try to remove any "BOM"s.

    But if both files are free-form plain text (or you otherwise
    have reason to think that the result will be sensible), are in the
    SAME encoding, and are BOM-FREE (or all the BOMs are really ZWNBSPs),
    then that kind of con"cat"enation is fine.
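    If the files are known to be text in the same encoding but do carry
    BOMs, the stripping has to be done by a tool that knows the encoding,
    not by 'cat'. A rough sketch in Python, assuming UTF-8 throughout;
    `concat_utf8` and its `keep_first_bom` parameter are hypothetical
    names invented for this example, not an existing utility.

```python
import codecs

def concat_utf8(*chunks: bytes, keep_first_bom: bool = True) -> bytes:
    """Concatenate UTF-8 byte strings, dropping each chunk's leading BOM.

    Assumption: every chunk really is UTF-8 text. A BOM is re-emitted
    at the start only if the first chunk had one and keep_first_bom is set.
    """
    bom = codecs.BOM_UTF8
    first_had_bom = False
    parts = []
    for index, chunk in enumerate(chunks):
        if chunk.startswith(bom):
            if index == 0:
                first_had_bom = True
            chunk = chunk[len(bom):]  # strip only the leading BOM
        parts.append(chunk)
    body = b"".join(parts)
    if first_had_bom and keep_first_bom:
        return bom + body
    return body

file1 = codecs.BOM_UTF8 + b"first\n"
file2 = codecs.BOM_UTF8 + b"second\n"
merged = concat_utf8(file1, file2)
print(repr(merged.decode("utf-8-sig")))
```

    Only a BOM at the start of each input is touched; a U+FEFF elsewhere
    in a file is left alone, since there it may be an intended ZWNBSP.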

                    /kent k


