From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Feb 08 2010 - 04:17:09 CST
What is the default encoding of Apache itself?
Except for the messages that the server generates itself for the system administrator (for example in system
logs), Apache has no default and simply serves pages in whatever encoding the web designer saved them in. If the
web designer specifies no encoding in the pages (or in the HTTP settings files added to the content repository),
Apache will not add any charset parameter to the Content-Type header of its HTTP replies.
If there are apparent defaults, they will most often come from the server plugins installed on top of Apache, or
from the underlying OS (for example in the way it encodes the local file names, if that OS specifies a codepage in
its APIs).
So browsers receive the pages without any declared encoding, and they have to "guess" it or fall back on the
user's settings.
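To make that fallback concrete, here is a minimal Python sketch of what a client might do with the Content-Type
header. The function names and the "windows-1252" user default are assumptions for illustration only; real
browsers also sniff the content and look at meta tags.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset()


def pick_encoding(content_type, user_default="windows-1252"):
    """Use the declared charset if there is one, else the user's setting.

    user_default is a hypothetical stand-in for a browser preference.
    """
    return declared_charset(content_type) or user_default


with_charset = pick_encoding("text/html; charset=UTF-8")
without_charset = pick_encoding("text/html")
```

With no charset parameter in the reply, the result depends entirely on the user's setting, which is the situation
described above.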
There's nothing to change in Apache: it's up to web designers to be careful about the design of their pages, up to
web administrators to set up the Apache environment correctly if they want to enforce a default, and up to
server-side script authors to provide a framework for their CMS that lets web designers or authors publish their
work or data with a predictable encoding, and to provide a test framework to make sure that no encoding errors
happen.
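As one possible building block for such a test framework, the following Python sketch checks that a generated page
is valid in the encoding it is supposed to be in. The function name is hypothetical and the check is deliberately
simple: it only detects byte sequences that are invalid in the target encoding.

```python
def find_encoding_error(payload, encoding="utf-8"):
    """Return the byte offset of the first invalid sequence, or None if clean."""
    try:
        payload.decode(encoding)  # strict decoding raises on invalid bytes
    except UnicodeDecodeError as exc:
        return exc.start
    return None


clean = find_encoding_error("Prénom".encode("utf-8"))
# The same word stored as ISO 8859-1 is not valid UTF-8.
broken = find_encoding_error("Prénom".encode("iso-8859-1"))
```

Note that this cannot catch the reverse mistake: any byte string decodes "successfully" as ISO 8859-1, so UTF-8 data
mislabeled as ISO 8859-1 needs a different kind of test.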
I still see lots of web pages that mix several distinct encodings on the same page: for example a form generated
by a plugin or standard script in UTF-8, a second form generated by another plugin or standard script in
ISO 8859-1, plus the encoding of the page headers/footers/menus as they were created by the web designer. It is of
course impossible to guess or set any single encoding that will match all the content displayed in the SAME HTML
document (i.e. not in a separate frame), and users will see the U+FFFD replacement symbol in browsers.
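The effect is easy to reproduce. In this small Python sketch (the sample strings are made up), two fragments in
different encodings are concatenated into one document, and neither choice of encoding decodes both correctly:

```python
# Two fragments of the same page, produced by different components.
menu = "Café".encode("utf-8")          # written by the web designer in UTF-8
form = "Prénom".encode("iso-8859-1")   # emitted by a legacy script

page = menu + b" " + form

# Decoded as UTF-8, the ISO 8859-1 byte 0xE9 is invalid and is replaced.
as_utf8 = page.decode("utf-8", errors="replace")

# Decoded as ISO 8859-1, the two UTF-8 bytes of "é" become two characters.
as_latin1 = page.decode("iso-8859-1")
```

Here as_utf8 contains U+FFFD in place of the "é" of the form, while as_latin1 shows the familiar "Ã©" in the menu
instead.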
Such errors occur when a web site has decided to change the default encoding of the general framework, but the data
coming from a CMS repository (or from a database) has not been migrated to the new encoding (UTF-8 most often), and
nothing has been done to mark the old data (ISO 8859-* most often) so that the framework can transparently
transcode it to UTF-8 on the fly.
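A minimal sketch of that kind of marking, in Python: each stored record carries the name of the encoding it was
saved in, and the framework transcodes to UTF-8 when assembling the page. The record layout and the names used here
are assumptions for illustration, not taken from any particular CMS.

```python
def to_utf8(raw, source_encoding):
    """Transcode bytes from their recorded encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical repository rows: old data is marked with its legacy encoding.
records = [
    {"body": "Prénom".encode("iso-8859-1"), "encoding": "iso-8859-1"},
    {"body": "Café".encode("utf-8"), "encoding": "utf-8"},
]

# Every fragment is normalized before being assembled into one document.
page = b" ".join(to_utf8(r["body"], r["encoding"]) for r in records)
```

The important part is the "encoding" field: without it, the framework has no reliable way to know which rows still
need transcoding.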
So you can't change the defaults in an existing system without some development work and testing of the
integration of the various components that are assembled to make up that system.
> Message of 29/01/10 18:55
> From: "Jonathan Rosenne"
> To: "'Unicode Mailing List'"
> Cc:
> Subject: RE: FYI: Google blog on Unicode
>
>
> Don't be so haughty - nobody changes defaults without good reason and understanding what it means.
>
> Jony
>
> > -----Original Message-----
> > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
> > Behalf Of Ed Trager
> > Sent: Friday, January 29, 2010 4:29 PM
> > To: Unicode Mailing List
> > Subject: Re: FYI: Google blog on Unicode
> >
> > On Thu, Jan 28, 2010 at 11:30 PM, Curtis Clark
> > wrote:
> > > On 2010-01-28 11:16, Ed Trager wrote:
> > >>
> > >> Now I just wish that the Apache people would make UTF-8 the
> > *default*,
> > >> *out-of-the-box* encoding for the Apache web server.
> > >
> > > Hear, hear! In my experience, character-code-clueless sysadmins never
> > like
> > > to change the defaults.
> >
> > Yes - especially the sysadmins at clueless ISPs.
> >
> >
> > >
> > > --
> > > Curtis Clark http://www.csupomona.edu/~jcclark/
> > > Director, I&IT Web Development +1 909 979 6371
> > > University Web Coordinator, Cal Poly Pomona
> > >
> > >
> >
>
>
>
>
>
>