RE: HTML5 encodings (was: Re: BOCU patent)

From: Chris Weber (chris@casabasecurity.com)
Date: Mon Dec 21 2009 - 12:10:02 CST

Next message: Charlie Ruland ☘: "Re: Is there a Japanese character for the word Unicode? (from Re: Unicode Haiku Contest)"

Previous message: John H. Jenkins: "Re: Is there a Japanese character for the word Unicode? (from Re: Unicode Haiku Contest)"
In reply to: Doug Ewell: "Re: HTML5 encodings (was: Re: BOCU patent)"
Next in thread: verdy_p: "RE: HTML5 encodings (was: Re: BOCU patent)"
Reply: verdy_p: "RE: HTML5 encodings (was: Re: BOCU patent)"
Reply: Doug Ewell: "Re: HTML5 encodings (was: Re: BOCU patent)"

I disagree with the statement quoted below. Security can certainly depend on an attacker's ability to influence the auto-discovery of an encoding, but it isn't limited to that scenario.

"The security issue is largely a red herring. Security of HTML encodings
is related to incorrect auto-discovery of encodings, not to using
encodings that have been properly announced."

In the world of Web apps, most encoding-related security vulnerabilities and exploits come from an attacker's ability to control the charset that the page declares. In other words, an attacker injects a persistent UTF-7 encoded payload, then lures a victim to a page where that payload will render and where the attacker can also set the META or HTTP header charset to UTF-7. In this case the browser isn't auto-discovering anything: it sees UTF-7 as a valid declaration, and the Web app is blind, just delivering data.
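
To make the scenario concrete, here is a minimal Python sketch. The payload string and the naive_filter_allows() helper are hypothetical, made up for illustration, and the two decode() calls only stand in for a browser honoring a declared charset. They are not a model of any real browser or Web app.

```python
# Hypothetical stored payload: "<script>alert(1)</script>" with "<" and ">"
# written as UTF-7 shifted sequences ("+ADw-" and "+AD4-").
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"


def naive_filter_allows(data: bytes) -> bool:
    """Stand-in for a byte-level blacklist that only looks for ASCII markup."""
    return b"<script" not in data.lower()


# The filter sees nothing suspicious, so the Web app stores and serves the bytes.
filter_passed = naive_filter_allows(payload)

# Page declared as UTF-8: the payload stays inert text.
as_utf8 = payload.decode("utf-8")

# Page declared as UTF-7 (the attacker controls the charset): markup appears.
as_utf7 = payload.decode("utf-7")

print(filter_passed)
print(as_utf8)
print(as_utf7)
```

The bytes are identical in both cases; only the declared charset changes what the victim's browser would see.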

-Chris

-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Doug Ewell
Sent: Monday, December 21, 2009 6:38 AM
To: Unicode Mailing List
Cc: Peter Krefting
Subject: Re: HTML5 encodings (was: Re: BOCU patent)

Peter Krefting <peter at opera dot com> wrote:

>> "User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
>> encodings."
>>
>> Amazing, isn't it? So thoughtful of the HTML 5 WG to protect
>> developers' time by prohibiting a handful of selected encodings.
>
> There are some security issues related to these, and they are very
> rarely used on actual web pages, which is why they are on the
> "prohibited" list. Full reasoning behind it can probably be found on
> the HTML5 mailing list, although I do not have a link to share. One of
> the problems is that they are not ASCII based, and theoretically
> something like "<script>" can be encoded in such a way that a naïve
> ASCII-based parser wouldn't find it and filter it away from
> user-submitted input, making it easier to do cross-domain attacks.

SCSU is completely ASCII-based, as long as the text is in single-byte
mode, which would be the case for the entire HTML header, and usually
the entire text when encoding small alphabets. In "Unicode mode," SCSU
is essentially UTF-16BE (with a non-ASCII escape for some private-use
characters), and UTF-16BE is not prohibited.
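
As a rough illustration of that last point, the Python sketch below shows that a byte-level search for ASCII "<script>" does not match the same text encoded as UTF-16BE. Python's standard library has no SCSU codec, so only the UTF-16BE side is shown; the variable names are illustrative assumptions.

```python
markup = "<script>"

as_ascii = markup.encode("ascii")
as_utf16be = markup.encode("utf-16-be")

# A naive ASCII-based scan finds the tag in the ASCII bytes...
found_in_ascii = b"<script>" in as_ascii

# ...but not in UTF-16BE, where each character is preceded by a 0x00 byte.
found_in_utf16be = b"<script>" in as_utf16be

# The ASCII values are still there, in every second byte.
low_bytes = as_utf16be[1::2]
high_bytes = as_utf16be[0::2]

print(found_in_ascii, found_in_utf16be)
print(as_utf16be)
```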

The security issue is largely a red herring. Security of HTML encodings
is related to incorrect auto-discovery of encodings, not to using
encodings that have been properly announced. Even UTF-7, while
generally undesirable and unnecessary for Web pages, is "secure" if
correctly identified.

Henri Sivonen stated that the main reason for prohibiting encodings was
to avoid "wasting developer time" and focusing attention on support of
new features instead. Apparently he didn't feel developers were capable
of both.

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s

This archive was generated by hypermail 2.1.5 : Mon Dec 21 2009 - 12:12:27 CST