From: Addison Phillips (addison.phillips@quest.com)
Date: Thu Mar 03 2005 - 23:18:36 CST
Unfortunately, the default charset for text/* is... wait for it... US-ASCII. I quote RFC 2046:
    A critical parameter that may be specified in the Content-Type field
    for "text/plain" data is the character set. This is specified with a
    "charset" parameter, as in:

        Content-type: text/plain; charset=iso-8859-1

    Unlike some other parameter values, the values of the charset
    parameter are NOT case sensitive. The default character set, which
    must be assumed in the absence of a charset parameter, is US-ASCII.

If you use text/* types, you need to declare your charset.
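
To make the rule concrete, here is a minimal sketch in Python of how a receiver might pick the charset for a given Content-Type value. The function name effective_charset is hypothetical (it is not from the RFC or from any library), and the sketch only models the RFC 2046 default quoted above; a particular client or protocol may apply different fallbacks.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset a receiver should assume for a Content-Type value.

    Hypothetical helper for illustration only: it applies the RFC 2046
    default (US-ASCII) when a text/* type has no charset parameter, and
    returns None for non-text types that declare no charset.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    # get_content_charset() lowercases the declared value and returns
    # None when there is no charset parameter.
    declared = msg.get_content_charset()
    if declared is not None:
        return declared

    # No charset parameter: text/* falls back to US-ASCII.
    if msg.get_content_maintype() == "text":
        return "us-ascii"
    return None


if __name__ == "__main__":
    for header in ("text/plain",
                   "text/plain; charset=utf-8",
                   "application/octet-stream"):
        print(header, "->", effective_charset(header))
```

Under this reading, a UTF-8 file served as bare text/plain has to be treated as US-ASCII, so any byte above 0x7F in it is formally invalid. Declaring charset=utf-8 removes the ambiguity.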
Addison
Addison P. Phillips
Globalization Architect, Quest Software
http://www.quest.com
Chair, Internationalization Core Working Group
http://www.w3.org/International
Internationalization is not a feature.
It is an architecture.
> -----Original Message-----
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
> Behalf Of Doug Ewell
> Sent: Thursday, March 3, 2005 20:37
> To: Unicode Mailing List
> Cc: Elliotte Harold
> Subject: Re: Bad Content-type headers on Unicode web site?
>
> Elliotte Harold <elharo at metalab dot unc dot edu> wrote:
>
> > The URL
> >
> > http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
> >
> > appears to be served as type text/plain with no charset parameter...
> >
> > However, that file's contents seem to be UTF-8. Shouldn't this be
> > changed to
> >
> > Content-Type: text/plain; charset=utf-8
> >
> > or some such?
>
> What is the "default" encoding for text/plain with no charset parameter?
> Is there one?
>
> I like to think UTF-8 is "plain text" as much as Latin-1 or anything
> else.
>
> -Doug Ewell
> Fullerton, California
> http://users.adelphia.net/~dewell/
>
>