Re: BOM's at Beginning of Web Pages? Mac IE's Euro

From: Tex Texin (tex@i18nguy.com)
Date: Mon Feb 17 2003 - 05:57:41 EST

    Hi,
    > AFAICR, there is supposed to be no single non-ASCII character before that
    > <meta> tag.

    I don't believe the standard says that. However, it is recommended that the
    META content-type declaration be placed as early as possible in the document,
    precisely because any non-ASCII characters that appear before it may be
    misinterpreted.
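
    A minimal sketch of how one might check a page for this, assuming Python.
    The helper name and the regular expression are illustrative assumptions, not
    anything defined by the HTML spec; the pattern is a rough match for a META
    charset declaration, not a real HTML parser. It reports the offsets of any
    non-ASCII bytes (a UTF-8 BOM, for instance) that precede the declaration.

```python
import re
from typing import List

# Assumption: a loose pattern for '<meta ... charset=' in the raw bytes.
META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=", re.IGNORECASE)


def non_ascii_before_meta(raw: bytes) -> List[int]:
    """Return offsets of non-ASCII bytes that appear before the META charset.

    If no META charset declaration is found, the whole input is scanned.
    """
    match = META_CHARSET.search(raw)
    limit = match.start() if match else len(raw)
    return [i for i, byte in enumerate(raw[:limit]) if byte > 0x7F]


# A page that starts with the UTF-8 BOM (EF BB BF) ahead of the declaration.
page_with_bom = (
    b"\xef\xbb\xbf<html><head>"
    b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
)
print(non_ascii_before_meta(page_with_bom))
```

    An empty list means everything ahead of the declaration is plain ASCII, so
    a user agent can read that far without having to guess at the encoding.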

    Actually, some user agents do autodetection and override the META
    declaration, because so many pages labeled ISO 8859-1 are actually in some
    other encoding. This behavior might be changing as the tools that generate
    HTML get better at declaring the correct charset.
    Unfortunately, the spec says that HTTP encoding declarations take precedence
    over META declarations, and these are wrong fairly often as well, since
    document authors often can't control the encoding declared by their web
    server. (That is controlled by the web admins.)
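
    The precedence rule can be sketched roughly as follows, again assuming
    Python. The function name, the regular expressions, and the ISO 8859-1
    fallback are assumptions made for illustration; real user agents do more
    (including the autodetection mentioned above).

```python
import re
from typing import Optional

# Assumption: loose patterns for a charset parameter, not full parsers.
HEADER_CHARSET = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
META_CHARSET_VALUE = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def effective_charset(
    content_type_header: Optional[str],
    body: bytes,
    fallback: str = "iso-8859-1",  # assumed default, for illustration only
) -> str:
    """Pick a charset: HTTP header first, then the META declaration, then fallback."""
    if content_type_header:
        header_match = HEADER_CHARSET.search(content_type_header)
        if header_match:
            return header_match.group(1).lower()
    meta_match = META_CHARSET_VALUE.search(body)
    if meta_match:
        return meta_match.group(1).decode("ascii").lower()
    return fallback


# A correctly labeled UTF-8 page containing a euro sign (E2 82 AC).
page = (
    b'<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    b"</head>\xe2\x82\xac"
)

# The server's (wrong) declaration wins over the author's META declaration.
print(effective_charset("text/html; charset=ISO-8859-1", page))
# With no charset in the header, the META declaration is used.
print(effective_charset("text/html", page))
```

    With the server's label in force, the three bytes of the euro sign are
    decoded as three separate ISO 8859-1 characters, which is how a correctly
    authored page ends up displaying garbage.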

    tex
    Roozbeh Pournader wrote:
    >
    > On Sun, 16 Feb 2003, Doug Ewell wrote:
    >
    > > The Unicode home page includes the following line, right where it should
    > > be, in the <head> section:
    > >
    > > <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    > >
    > > Any User Agent that takes a page properly marked as UTF-8, as above, and
    > > still tries to autodetect a local code page, is badly misguided. How
    > > would it handle a real UTF-8-encoded euro sign (0xE2 0x82 0xAC)?
    >
    > AFAICR, there is supposed to be no single non-ASCII character before that
    > <meta> tag. I really don't like to search the specs again, but I'm sure I
    > saw it somewhere. The HTML renderer sees those characters and thinks the
    > document has already started (since the <html>, <head> and <body> tags
    > are not mandatory in HTML 4 Transitional). So it goes into autodetection
    > mode. The same situation happens with MS FrontPage 2000 (but I've already
    > explained that).
    >
    > roozbeh

    -- 
    -------------------------------------------------------------
    Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
    Xen Master                          http://www.i18nGuy.com
                             
    XenCraft		            http://www.XenCraft.com
    Making e-Business Work Around the World
    -------------------------------------------------------------
    

