The Google Anti-Malware engineering team knows you have many questions related to our scanning and flagging of infected sites, some with short and simple answers and some with more complex answers. The short-answer questions are already -- we hope -- adequately handled on the Webmaster Forums; now we want to do a better job at answering the more complex questions.
To this end, we have created a Google Moderator page for you to submit your questions, and to vote on other webmasters' questions. In two weeks (on Friday the 28th of August), we will close the page and select a few of the top-rated questions. Over the course of the next several weeks, we will do our best to answer each of these in a write-up, to be published here and to the Webmaster Malware Forum.
We hope to repeat this exercise (with a fresh Moderator page) in the fall to give you the opportunity to ask more questions.
Thank you, and see you on the Moderator page!
A recent surge in compromised web servers has generated many interesting discussions in online forums and blogs. We thought we would join the conversation by sharing what we found to be the most popular malware sites in the last two months.
As we've discussed previously, we constantly scan our index for potentially dangerous sites. Our automated systems found more than 4,000 different sites that appeared to be set up for distributing malware by massively compromising popular web sites. Of these domains, more than 1,400 were hosted in the .cn TLD. Several contained plays on the name of Google, such as goooogleadsence.biz.
Building on our earlier posts on defenses against web application flaws ["Automating Web Application Security Testing", "Meet ratproxy, our passive web security assessment tool"], we introduce Automatic Context-Aware Escaping (Auto-Escape for short), a functionality we added to two Google-developed general purpose template systems to better protect against Cross-Site Scripting (XSS).
We developed Auto-Escape specifically for general purpose template systems; that is, template systems that are for the most part unaware of the structure and programming language of the content on which they operate. These template systems typically provide minimal support for web applications, possibly limited to basic escaping functions that a developer can invoke to help escape unsafe content being returned in web responses. Our observation has been that web applications of substantial size and complexity using these template systems have an increased risk of introducing XSS flaws. To see why this is the case, consider the simplified template below in which double curly brackets {{ and }} enclose placeholders (variables) that are replaced with run-time content, presumed unsafe.
<body>
  <span style="color:{{USER_COLOR}};">
    Hello {{USERNAME}}, view your <a href="{{USER_ACCOUNT_URL}}">Account</a>.
  </span>
  <script>
    var id = {{USER_ID}};
    // some code using id, say:
    // alert("Your user ID is: " + id);
  </script>
</body>
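In this template, four variables are used (not in this order):
- USERNAME is inserted into regular HTML text.
- USER_ACCOUNT_URL is inserted into an HTML attribute that expects a URL, where unsafe schemes such as javascript: must be prevented.
- USER_COLOR is inserted into a CSS context, where constructs such as expression() or url() can be dangerous.
- USER_ID is inserted into a Javascript variable and is not enclosed in quotes.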
Each of these variable insertions requires a different escaping method or risks introducing XSS. To keep the example small, we excluded several contexts of interest, particularly style tags, HTML attributes that expect Javascript (such as onmouseover), and considerations of whether attribute values are enclosed within quotes or not (which also affects escaping).
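To make the point about per-context escaping concrete, here is a minimal Python sketch of four separate escapers, one for each context in the template above. It is an illustration only and not the escaping code used by any Google template system; the function names, the `SAFE_SCHEMES` allow-list, and the fallback values (`#`, `inherit`, `null`) are all assumptions made for this example.

```python
import html
import re
from urllib.parse import urlsplit

# Hypothetical allow-list of URL schemes (an assumption for this sketch).
SAFE_SCHEMES = {"http", "https", "mailto"}
# Hypothetical, deliberately strict pattern: hex colors or plain color names.
CSS_COLOR = re.compile(r"#[0-9a-fA-F]{3,8}|[a-zA-Z]{1,20}")


def escape_html(value: str) -> str:
    """Regular HTML text: escape markup-significant characters."""
    return html.escape(value, quote=True)


def escape_url(value: str) -> str:
    """URL attribute: reject unsafe schemes, then HTML-escape."""
    # Drop whitespace and control characters before looking at the scheme.
    compact = re.sub(r"[\x00-\x20]", "", value)
    try:
        scheme = urlsplit(compact).scheme.lower()
    except ValueError:
        return "#"
    if scheme and scheme not in SAFE_SCHEMES:
        return "#"  # e.g. javascript: pseudo-URLs end up here
    return html.escape(value, quote=True)


def escape_css_color(value: str) -> str:
    """CSS value: allow only simple colors, so expression() and url() fail."""
    return value if CSS_COLOR.fullmatch(value) else "inherit"


def escape_js_number(value: str) -> str:
    """Unquoted Javascript number: emit only a validated integer literal."""
    try:
        return str(int(value))
    except ValueError:
        return "null"
```

Note that HTML escaping alone would leave `javascript:alert(1)` untouched in the href attribute, since it contains no characters that HTML escaping rewrites. That is why the URL context needs its own check, and why `escape_url("javascript:alert(1)")` returns `#` in this sketch.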
The example above demonstrates the importance of understanding the precise context in which variables are being inserted and the need for escaping functions that are both safe and correct for each. For larger and complex web applications, we notice two related vectors for XSS:
Considering the sheer number of templates in large web applications and the amount of untrusted content they may operate on, the process of proper escaping becomes complicated and error prone. It is also difficult to audit efficiently from a security testing perspective. We developed Auto-Escape to move that complexity from the developer into the template system and thereby reduce the risk of XSS.
Auto-Escape is functionality designed to make the template system aware of the web application context and therefore able to apply the required escaping automatically and correctly. This is achieved in three parts:
A simple mechanism is provided for the developer to indicate that some variables are safe and should not be escaped. This is used for variables that are either escaped through other means in source code or contain trusted markup that should be emitted intact.
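The two ideas described here, inferring the context of each variable from the template itself and letting the developer mark a value as safe, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Auto-Escape is implemented: `context_of` is a regex heuristic rather than a real HTML/Javascript parser, and `TrustedMarkup`, `placeholder_contexts`, and `emit` are hypothetical names invented for this example.

```python
import html
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


class TrustedMarkup(str):
    """Hypothetical marker type: a value the developer declares safe to emit intact."""


def context_of(prefix: str) -> str:
    """Guess the context at the end of `prefix` (toy heuristic, not a parser)."""
    lowered = prefix.lower()
    # Inside a <script> element that has not been closed yet?
    if lowered.rfind("<script") > lowered.rfind("</script"):
        return "js"
    # Inside a double-quoted attribute value that is still open?
    attr = re.search(r'([\w-]+)\s*=\s*"[^"]*$', prefix)
    if attr:
        name = attr.group(1).lower()
        if name == "style":
            return "css"
        if name in ("href", "src"):
            return "url"
        return "attr"
    return "html"


def placeholder_contexts(template: str) -> dict:
    """Map each {{NAME}} placeholder to the context it appears in."""
    return {
        match.group(1): context_of(template[: match.start()])
        for match in PLACEHOLDER.finditer(template)
    }


def emit(value) -> str:
    """HTML-escape a value unless it was explicitly marked as trusted."""
    if isinstance(value, TrustedMarkup):
        return str(value)
    return html.escape(str(value), quote=True)


TEMPLATE = (
    '<body> <span style="color:{{USER_COLOR}};"> Hello {{USERNAME}}, view your '
    '<a href="{{USER_ACCOUNT_URL}}">Account</a>. </span> '
    "<script> var id = {{USER_ID}}; </script></body>"
)
```

Calling `placeholder_contexts(TEMPLATE)` classifies USER_COLOR as `css`, USERNAME as `html`, USER_ACCOUNT_URL as `url`, and USER_ID as `js`. A real system would then pick the matching escaper for each context. The opt-out in `emit` is deliberately explicit, so values that skip escaping are easy to find during a security review.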
Auto-Escape has been released with the C++ Google Ctemplate for a while now and it continues to develop there. You can read more about it in the Guide to using Auto-Escape. We also implemented Auto-Escape for the ClearSilver template system and expect it to be released in the near future. Lastly, we are in the process of integrating it into other template systems developed at Google for Java and Python and are interested in working with a few other open source template systems that may benefit from this logic. Our HTML/Javascript parser is already available with the Google Ctemplate distribution and is expected to be released as a stand-alone open source project very soon.
Co-developers: Filipe Almeida and Mugdha Bendre