Re: Problem with SSI and BOM

From: Doug Ewell (
Date: Sun Sep 24 2006 - 17:02:13 CST

    Addison Phillips <addison at yahoo dash inc dot com> wrote:

    > The BOM is often rendered in the page, throwing off other display
    > elements. One common problem on Windows is the prevalence of editors
    > (Notepad!!) that add the UTF-8 BOM to text files stored as "UTF-8".
    > While one might expect this to act as a "no-op" character, in
    > practice, it isn't.

    It should, though. A process that claims to "support Unicode" should
    at least be able to follow the simple rule, "If the file or stream
    starts with the bytes EF BB BF, throw them away and treat the remainder
    of the file or stream as UTF-8."
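
    As a rough illustration of that rule, here is a minimal Python sketch.
    The function name decode_utf8_skipping_signature is made up for this
    example; only codecs.BOM_UTF8 and bytes.decode come from the standard
    library.

```python
import codecs

# The UTF-8 signature: the three bytes EF BB BF.
UTF8_SIGNATURE = codecs.BOM_UTF8


def decode_utf8_skipping_signature(data: bytes) -> str:
    """Drop one leading EF BB BF if present, then decode the rest as UTF-8.

    Hypothetical helper for illustration; it is not taken from any
    particular product or library.
    """
    if data.startswith(UTF8_SIGNATURE):
        data = data[len(UTF8_SIGNATURE):]
    return data.decode("utf-8")
```

    Python's built-in "utf-8-sig" codec applies the same rule on decoding,
    so data.decode("utf-8-sig") would serve as well.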

    Even the W3C FAQ says: "In some browsers, the presence of a UTF-8
    signature will cause the browser to interpret the text as UTF-8
    regardless of any character encoding declarations to the contrary."
    That's exactly what it should do.
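
    The behavior the FAQ describes might be sketched like this in Python.
    The name pick_encoding and its declared parameter are assumptions made
    for the example, not any browser's actual logic.

```python
import codecs


def pick_encoding(data: bytes, declared: str) -> str:
    """Return the encoding to use for a page.

    Hypothetical sketch: a leading UTF-8 signature (EF BB BF) wins over
    whatever encoding was declared; otherwise the declaration is used.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    return declared
```

    With this, a page that starts with EF BB BF is read as UTF-8 even if
    its declaration says ISO-8859-1.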

    The argument about accidentally throwing away a U+FEFF that was intended
    as a ZWNBSP is becoming increasingly irrelevant; U+2060 WORD JOINER has
    been recommended over ZWNBSP for over four years now, and few
    applications used ZWNBSP anyway.

    Doug Ewell
    Fullerton, California, USA
    RFC 4645  *  UTN #14
