Re: Several BOMs in the same file

From: Chris Jacobs (c.t.m.jacobs@hccnet.nl)
Date: Sun Mar 23 2003 - 10:26:57 EST


    ----- Original Message -----
    From: "Pim Blokland" <pblokland@planet.nl>
    To: "Unicode List" <unicode@unicode.org>
    Sent: Sunday, March 23, 2003 2:43 PM
    Subject: Re: Several BOMs in the same file

    [ ... ]

    > But now you've got me wondering whether there are any rules or
    > guidelines for the situation where two files are joined, and the
    > second one has a BOM, but the first one hasn't. Should the resulting
    > file have a BOM? I.E. should a BOM be added to what was the contents
    > of the first file?

    In that case you should seriously consider the possibility that the byte
    order for both files is different!

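    Suppose the first file is big-endian UTF-16 without a BOM and the second
    is little-endian UTF-16 with a BOM. A plain byte-for-byte concatenation
    then yields a file whose byte order changes partway through.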

    A concatenation routine which interprets the encodings and encodes the
    output afresh handles the BOMs correctly without any extra effort. On
    interpreting, BOMs are removed, and on freshly encoding, a new BOM may
    appear at the start of the file but never in the middle.
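
As a rough illustration of such a routine, here is a minimal Python sketch. It assumes every input is some form of UTF-16 and that an input without a BOM is big-endian; the names decode_utf16 and concat_utf16 are made up for this example and do not come from any library. Only the standard codecs constants are used.

```python
import codecs

def decode_utf16(data: bytes, default: str = "utf-16-be") -> str:
    """Decode UTF-16 bytes, dropping a leading BOM if there is one.

    Assumption: input without a BOM uses the `default` byte order.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    return data.decode(default)

def concat_utf16(parts, default: str = "utf-16-be") -> bytes:
    """Interpret each part, join the text, and encode the result afresh.

    The output is big-endian with exactly one BOM, at the very start.
    """
    text = "".join(decode_utf16(p, default) for p in parts)
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# The case from this thread: BE without BOM, then LE with BOM.
first = "AB".encode("utf-16-be")
second = codecs.BOM_UTF16_LE + "CD".encode("utf-16-le")

naive = first + second                  # byte order flips in the middle
joined = concat_utf16([first, second])  # one byte order, one BOM
```

Reading `naive` as big-endian turns the second file's BOM into U+FFFE and its text into unrelated characters, whereas `joined` decodes cleanly as "ABCD" after its single leading BOM.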

    But what are the conformance requirements for concatenation routines? Are
    they required to grok the encodings?

    > Pim Blokland
    >
    >


