RE: UTF-8 signature in web and email

From: Richard, Francois M (Francois.M.Richard@usa.xerox.com)
Date: Tue May 15 2001 - 08:11:23 EDT


UTF-8 is treated as a character encoding like any other.
The BOM is recommended for UTF-16 only.
See http://www.w3.org/TR/REC-html40/charset.html#h-5.2.1
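
To make the "signature" concrete, here is a minimal Python sketch that checks
whether a byte stream starts with a UTF-8 signature or a UTF-16 BOM. The
function name detect_signature is hypothetical, and UTF-32 is deliberately
ignored; this is an illustration, not what any particular user agent does.

```python
import codecs

def detect_signature(data: bytes):
    """Return the encoding implied by a leading signature/BOM, or None.

    Hypothetical helper for illustration only. UTF-32 is not handled,
    so a UTF-32LE BOM (FF FE 00 00) would be misreported as UTF-16.
    """
    # UTF-8 signature: the bytes EF BB BF (U+FEFF encoded in UTF-8).
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # UTF-16 BOM: FF FE (little-endian) or FE FF (big-endian).
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None
```

For example, detect_signature(b"\xef\xbb\xbf<!DOCTYPE html>") reports
"utf-8", while the same document without the three leading bytes reports
None.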

For character encoding determination (see
http://www.w3.org/TR/REC-html40/charset.html#h-5.2), the priorities are
defined as follows (highest first):

1- An HTTP "charset" parameter in a "Content-Type" field.
2- A META declaration with "http-equiv" set to "Content-Type" and a value
set for "charset".
3- The charset attribute set on an element that designates an external
resource.
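
The ordering above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function pick_charset and its
parameter names are hypothetical, and the caller is assumed to have already
extracted each candidate value from the HTTP header, the META element, and
the linking element.

```python
def pick_charset(http_charset=None, meta_charset=None, link_charset=None):
    """Pick a charset using the priority order listed above.

    Hypothetical helper: each argument is the already-extracted value
    from one source, or None if that source gave no charset.
    Returns a (source, charset) pair, or (None, None) if nothing is set.
    """
    candidates = (
        ("http", http_charset),  # 1. HTTP Content-Type "charset" parameter
        ("meta", meta_charset),  # 2. META http-equiv="Content-Type" declaration
        ("link", link_charset),  # 3. charset attribute on the linking element
    )
    for source, value in candidates:
        if value:
            # Charset names are case-insensitive, so normalize for comparison.
            return source, value.strip().lower()
    return None, None
```

With this sketch, a document served with "charset=ISO-8859-1" in the HTTP
header and "UTF-8" in its META declaration resolves to the HTTP value,
because the header has the higher priority.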

/Francois

> -----Original Message-----
> From: Roozbeh Pournader [mailto:roozbeh@sharif.edu]
> Sent: Monday, May 14, 2001 7:32 PM
> To: Unicode List; www-i18n@w3.org
> Subject: UTF-8 signature in web and email
>
>
>
> Well, I received a UTF-8 email from Microsoft's Dr International today. It
> was a "multipart/alternative", with both the "text/plain" and "text/html"
> in UTF-8. Well, nothing interesting yet, but the interesting point was
> that the HTML version had a UTF-8 signature, but the text version lacked
> it. So, the HTML version had it three times: mime charset as UTF-8,
> UTF-8 signature, and <meta> charset markup.
>
> Questions:
>
> 1. What are the current recommendations for these?
>
> 2. Most important of all, does W3C allow UTF-8 signatures before
> "<!DOCTYPE>"? And if yes, what should be done if they mismatch the
> charset as can be described in the <meta> tag?
>
> --roozbeh
>
>


