Re: UTF-8 and Big5

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed May 31 2006 - 21:59:27 CDT


From: "Dean Harding" <dean.harding@dload.com.au>
> If they're just static HTML pages, you can put an override for the encoding
> in a <meta> header, like so:
>
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=Big5" />
> ... rest of the file here ...
>
> You should put it *before* any actual big5 characters appear (though it
> should still work if you don't - the browser is supposed to start parsing
> over again once it sees that header)

Except that this is tedious to add to every edited web page, and the user said that the encoding would be set in the server-side settings (meaning that UTF-8 would be declared in the HTTP headers).

When the HTTP "Content-Type:" header specifies the charset, it overrides whatever is indicated in the HTML meta tag.
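To illustrate that precedence, here is a minimal Python sketch. It is a simplification and not what any particular browser does (real browsers also look at byte order marks and other hints); the function names are my own, invented for the example.

```python
import re
from email.message import Message

# Rough pattern for a charset declared in a <meta> tag (illustrative only,
# not a full HTML parser).
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)

def header_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def effective_charset(content_type, body):
    """Pick the charset: the HTTP header wins, the <meta> tag is the fallback."""
    from_header = header_charset(content_type)
    if from_header:
        return from_header
    # Only the start of the document is scanned for a <meta> declaration.
    match = META_CHARSET_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None
```

With a page that declares Big5 in its meta tag, a server header of "text/html; charset=UTF-8" still yields UTF-8; only a bare "text/html" header lets the meta declaration apply.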

The solution to this problem is to give the HTML pages different file extensions, and to configure one server-side setting per extension.

For example,
* set the default MIME type for "*.html" to "text/html" (without a charset parameter, so that the encoding declared inside the document applies; this also suits ASCII-only resources)
* set the alternate MIME type for "*.UTF-8.html" to "text/html; charset=UTF-8"
* set the alternate MIME type for "*.Big5.html" to "text/html; charset=Big5"
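The three rules above can be sketched in Python as follows. This is only an illustration of the naming convention, not a server configuration; the table and the function name are assumptions made for the example. (On Apache, mod_mime's AddCharset directive is the usual place to express this kind of mapping.)

```python
# Hypothetical table: lowercased filename infix -> charset label to send.
CHARSET_BY_INFIX = {"utf-8": "UTF-8", "big5": "Big5"}

def content_type_for(filename):
    """Return the Content-Type for an HTML file based on its name, or None."""
    parts = filename.lower().split(".")
    if len(parts) < 2 or parts[-1] != "html":
        return None  # not an HTML page under this convention
    # "name.<charset>.html": the part just before ".html" names the encoding.
    if len(parts) >= 3 and parts[-2] in CHARSET_BY_INFIX:
        return "text/html; charset=" + CHARSET_BY_INFIX[parts[-2]]
    # Plain "name.html": no charset parameter, the document declares its own.
    return "text/html"
```

For example, content_type_for("news.Big5.html") gives "text/html; charset=Big5", while content_type_for("index.html") gives plain "text/html".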

The same technique can be used for external JavaScript and CSS files, or for PHP and Java pages.

This way, only the filename convention specifies the encoding to use; you may even hide these local server-side extensions from the URLs. See the Apache configuration documentation for an example.
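Hiding the extension could work roughly like the Python sketch below: the public URL stays "/docs/intro.html" and the server looks for a charset-tagged file first. The lookup order, the set of files, and the names are all assumptions for the example, not how Apache itself resolves requests.

```python
import posixpath

# Hypothetical infixes used in server-side filenames, with their charset labels.
CHARSETS = {"UTF-8": "UTF-8", "Big5": "Big5"}

def resolve(url_path, files_on_disk):
    """Map a public URL path to (server-side filename, Content-Type), or None."""
    stem, ext = posixpath.splitext(url_path)
    if ext != ".html":
        return None
    # Prefer a charset-tagged file such as "/docs/intro.Big5.html".
    for infix, charset in CHARSETS.items():
        candidate = f"{stem}.{infix}{ext}"
        if candidate in files_on_disk:
            return candidate, f"text/html; charset={charset}"
    # Fall back to the plain file, served without a charset parameter.
    if url_path in files_on_disk:
        return url_path, "text/html"
    return None
```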

This is just a matter of local conventions for the conversion of server-side filenames to URLs and MIME type.



This archive was generated by hypermail 2.1.5 : Wed May 31 2006 - 22:31:43 CDT