Re: Several BOMs in the same file

From: Pim Blokland (pblokland@planet.nl)
Date: Tue Mar 25 2003 - 08:46:13 EST


    Marco Cimarosti wrote:

    > > Is this in accordance with the Unicode standard, or do I have
    > > to remove the second BOM?
    >
    > IMHO, Unicode should not specify such a behavior. Deciding what a
    > shell

    IMHO, it should. The guideline that says a text file can have a
    U+FEFF at the beginning, but it really shouldn't have U+FEFFs
    elsewhere, implies that a second BOM should be removed if possible.
    Of course this should be done only if the operating system knows the
    files are text files, either implicitly by checking the file types,
    or by the user manually forcing the OS to treat them as such.
    In that case, removing the BOM that would end up somewhere in the
    middle is the natural thing to do, just as removing the EOF marker
    at the end of the first file is.
    I'm not going into the implementation part; just pointing out that
    this issue is not something an operating system can ignore.
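
For illustration only, here is a minimal sketch of the behaviour described above: concatenating files that are known to be text, keeping the first U+FEFF and dropping any BOM that would otherwise land in the middle. It is not how any particular operating system does it. It assumes UTF-8 input (where the BOM is the byte sequence EF BB BF) and assumes the EOF marker is the DOS Ctrl-Z byte (0x1A); the function name `concat_text` is hypothetical.

```python
import codecs

# Assumption: the files are UTF-8, so the BOM is the bytes EF BB BF.
BOM = codecs.BOM_UTF8
# Assumption: the EOF marker is the DOS/CP/M Ctrl-Z byte.
EOF_MARKER = b"\x1a"


def concat_text(chunks):
    """Concatenate text files (given as bytes), keeping only the first BOM.

    A leading BOM is stripped from every file except the first, and a
    trailing EOF marker is stripped from every file except the last.
    """
    out = bytearray()
    last = len(chunks) - 1
    for i, data in enumerate(chunks):
        if i > 0 and data.startswith(BOM):
            data = data[len(BOM):]        # BOM would end up mid-file
        if i < last and data.endswith(EOF_MARKER):
            data = data[:-len(EOF_MARKER)]  # EOF marker would end up mid-file
        out += data
    return bytes(out)
```

Calling `concat_text([first, second])` on the raw bytes of two such files returns one byte string with a single BOM at the start. A U+FEFF that appears anywhere other than the very start of an input file is left untouched, since the sketch cannot tell whether it was meant as a BOM.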

    Pim Blokland


