Re: browsers and unicode surrogates

From: Tex Texin
Date: Mon Apr 22 2002 - 11:20:46 EDT

Jim, thanks for all the info.
I would prefer not to clutter filenames with encodings and locales, and
the rest I need to coordinate with my ISP. I'll talk to them and
see what they will let me do.


"James H. Cloos Jr." wrote:
> >>>>> "Tex" == Tex Texin writes:
> Tex> I am surprised by the "must only be used". It seems I am not
> Tex> conforming by including a meta statement in the utf-16 HTML
> Tex> page. I should either remove the statement or encode the HTML up
> Tex> to and including that statement as ascii. I'll check on this.
> Since you are using apache, it is quite easy to get the extra headers
> sent at the protocol level rather than having to use meta tags.
> You can use a Header directive in an .htaccess file a la:
> <Files foobar.html>
> Header set Content-Language en-US
> Header set Content-Type "text/html; charset=UTF-8"
> </Files>
> Or, you can use mod_cern_meta to put the extra headers in a
> foo.html.meta file. (The actual filename suffix can be set in the
> .htaccess file or the main server conf files.)
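A minimal sketch of the mod_cern_meta approach, assuming the module is
loaded and the server allows these directives in .htaccess. The file name
foo.html.meta and the choice of "MetaDir ." (look for meta files in the
same directory as the document) are illustrative assumptions, not settings
taken from this thread:

```apache
# .htaccess (sketch; requires mod_cern_meta to be loaded)
MetaFiles on
# Look for meta files next to the document instead of in a .web subdirectory
MetaDir .
# Suffix appended to the document name: foo.html -> foo.html.meta
MetaSuffix .meta

# Hypothetical contents of foo.html.meta, one header per line:
#   Content-Language: en-US
#   Content-Type: text/html; charset=UTF-8
```

Whether your ISP permits these directives depends on their AllowOverride
settings, so this may need to go in the main server configuration instead.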
> There are other ways as well. Apache will already (if you use the
> default configs) add the Content-Language header if you use a filename
> like foo.en.html. You could have it also add the charset via a
> similar mechanism. Something like:
> AddCharset UTF-8 utf8
> will make foobar.en.utf8.html send the headers:
> Content-Language: en
> Content-Type: text/html; charset=UTF-8
> given the default configs for language and type extensions.
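Putting the two extension mappings side by side may make this clearer. The
sketch below assumes mod_mime is active and that .htaccess overrides for
FileInfo are allowed; the AddLanguage line is written out only for
illustration, since the default configs usually provide it already:

```apache
# .htaccess (sketch; extensions are illustrative)
# Map the .en extension to Content-Language: en
AddLanguage en .en
# Map the .utf8 extension to charset=UTF-8 on the Content-Type header
AddCharset UTF-8 .utf8
```

With these in place, a file named foobar.en.utf8.html should pick up both
the language and the charset from its name, with no meta tag needed.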
> Hmmm. Looking at a recent install of SuSE, using their apache rpm,
> .utf8 is already configured as an extension to set charset=UTF-8, so
> you could try just renaming the file to eg:
> to set the charset. You'd have to add your own AddCharset directives
> for UTF-16 and UTF-32.
> -JimC
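For the UTF-16 and UTF-32 pages, the extra AddCharset directives could look
roughly like this. The extensions .utf16 and .utf32 are hypothetical names
chosen for this sketch; any extension not already mapped to something else
should work:

```apache
# .htaccess (sketch; .utf16 and .utf32 are hypothetical extensions)
AddCharset UTF-16 .utf16
AddCharset UTF-32 .utf32
```

A file such as foobar.en.utf16.html would then be expected to go out with
charset=UTF-16 in its Content-Type header.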

Tex Texin                    Director, International Business    the Progress Company
Tel: +1-781-280-4271
"The world writes in my database!" Progress Exchange 2002
Globalization Empowerment for Progress users
A compelling demonstration for Unicode:

This archive was generated by hypermail 2.1.2 : Mon Apr 22 2002 - 12:00:47 EDT