Re: UTF-8 encoded texts on the website (was Zip vs. Non Zipped and ISO 15924 draft fixes)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri May 21 2004 - 14:09:25 CDT

    From: <jcowan@reutershealth.com>
    > Jon Hanna scripsit:
    >
    > > [T]he default encoding on the server (which really should be utf-8
    > > on www.unicode.org at this stage).
    >
    > Currently it is, but there are sticky issues: in particular, a default
    > encoding overrides information in HTML meta elements as well as browser
    > heuristics, at least for modern browsers.
    >
    > Consequently, random pages that happen to be in non-Unicode charsets are
    > getting mis-served and mis-displayed. The site will probably revert to
    > having no default as a result, which is a great pity.
    >
    > Talk to Sarasvati if you have a better idea.

    You can instruct Apache to serve part of the site with a different default
    encoding by uploading, with your FTP client, a .htaccess file that sets a
    different default charset or MIME type association for that directory.
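
    As a rough sketch of such a .htaccess file (the ISO-8859-1 value is only an
    assumed example, and the server must permit the override with
    "AllowOverride FileInfo"):

    ```apache
    # .htaccess placed in a directory whose pages are not UTF-8.
    # Assumption for illustration: the legacy pages here are ISO-8859-1.
    AddDefaultCharset ISO-8859-1

    # Alternative: send no default charset for this directory, so that
    # <meta> elements and browser heuristics apply again.
    # AddDefaultCharset Off
    ```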

    What I did on another website was to give plain-text files encoded in UTF-8
    a ".UTF-8.txt" double extension, map that double extension to
    "text/plain; charset=UTF-8", and set this mapping in Apache's default
    config file.
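
    A minimal sketch of one way to get this effect with mod_mime (an
    assumption, not necessarily the exact configuration described above; the
    filename in the comment is hypothetical):

    ```apache
    # Main server configuration (e.g. httpd.conf).
    # mod_mime treats each dot-separated part of a filename as an extension,
    # so "notes.UTF-8.txt" picks up both mappings below. Extension matching
    # is case-insensitive.
    AddType text/plain .txt
    AddCharset UTF-8 .utf-8
    ```

    With both mappings in place, such a file should be served as
    "text/plain; charset=UTF-8", while other .txt files keep whatever default
    applies to them.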

    This way, there is no longer any need to create .htaccess files throughout
    the site, and visitors also have an explicit clue (in the filename) about
    which charset to select if the browser ignores the "Content-Type:" header,
    the leading UTF-8 BOM, the encoding declaration in <?xml ...?>, and the
    <meta> tag in the HTML <head> section. (There are lots of ways to specify
    it: which browser will ignore all of these?)



    This archive was generated by hypermail 2.1.5 : Fri May 21 2004 - 14:10:17 CDT