Security Blog
The latest news and insights from Google on security and safety on the Internet
Content hosting for the modern web
August 29, 2012
Posted by Michal Zalewski, Security Team
Our applications host a variety of web content on behalf of our users, and over the years we learned that even something as simple as serving a profile image can be surprisingly fraught with pitfalls. Today, we wanted to share some of our findings about content hosting, along with the approaches we developed to mitigate the risks.
Historically, all browsers and browser plugins were designed simply to excel at displaying several common types of web content, and to be tolerant of any mistakes made by website owners. In the days of static HTML and simple web applications, giving the owner of the domain authoritative control over how the content is displayed wasn’t of any importance.
It wasn’t until the mid-2000s that we started to notice a problem: a clever attacker could manipulate the browser into interpreting seemingly harmless images or text documents as HTML, Java, or Flash—thus gaining the ability to execute malicious scripts in the
security context
of the application displaying these documents (essentially, a
cross-site scripting flaw
). For all the increasingly sensitive web applications, this was very bad news.
During the past few years, modern browsers began to improve. For example, the browser vendors limited the amount of second-guessing performed on text documents, certain types of images, and unknown MIME types. However, there are many standards-enshrined design decisions—such as ignoring MIME information on any content loaded through
<object>
,
<embed>
, or
<applet>
—that are much more difficult to fix; these practices may lead to vulnerabilities similar to the
GIFAR bug
.
Google’s security team played an active role in investigating and remediating many content sniffing vulnerabilities during this period. In fact, many of the enforcement proposals were first prototyped in Chrome. Even still, the overall progress is slow; for every resolved problem, researchers discover a previously unknown flaw in another browser mechanism. Two recent examples are the Byte Order Mark (BOM) vulnerability reported to us by Masato Kinugawa, or the
MHTML attacks
that we have seen happening in the wild.
For a while, we focused on content sanitization as a possible workaround - but in many cases, we found it to be insufficient. For example, Aleksandr Dobkin managed to construct a purely alphanumeric Flash applet, and in our internal work the Google security team created images that can be forced to include a particular plaintext string in their body, after being scrubbed and recoded in a deterministic way.
In the end, we reacted to this raft of content hosting problems by placing some of the high-risk content in separate, isolated
web origins
—most commonly
*.googleusercontent.com
. There, the “sandboxed” files pose virtually no threat to the applications themselves, or to google.com authentication cookies. For public content, that’s all we need: we may use random or user-specific subdomains, depending on the degree of isolation required between unrelated documents, but otherwise the solution just works.
The situation gets more interesting for non-public documents, however. Copying users’ normal authentication cookies to the “sandbox” domain would defeat the purpose. The natural alternative is to move the secret token used to confer access rights from the
Cookie
header to a value embedded in the URL, and make the token unique to every document instead of keeping it global.
While this solution eliminates many of the
significant design flaws
associated with HTTP cookies, it trades one imperfect authentication mechanism for another. In particular, it’s important to note there are more ways to accidentally leak a capability-bearing URL than there are to accidentally leak cookies; the most notable risk is disclosure through the
Referer
header for any document format capable of including external subresources or of linking to external sites.
In our applications, we take a risk-based approach. Generally speaking, we tend to use three strategies:
In higher risk situations (e.g. documents with elevated risk of URL disclosure), we may couple the URL token scheme with short-lived, document-specific cookies issued for specific subdomains of
googleusercontent.com
. This mechanism, known within Google as
FileComp
, relies on a range of attack mitigation strategies that are too disruptive for Google applications at large, but work well in this highly constrained use case.
In cases where the risk of leaks is limited but responsive access controls are preferable (e.g., embedded images), we may issue URLs bound to a specific user, or ones that expire quickly.
In low-risk scenarios, where usability requirements necessitate a more balanced approach, we may opt for globally valid, longer-lived URLs.
Of course, the research into the security of web browsers continues, and the landscape of web applications is evolving rapidly. We are constantly tweaking our solutions to protect Google users even better, and even the solutions described here may change. Our commitment to making the Internet a safer place, however, will never waver.
Labels
#sharethemicincyber
#supplychain #security #opensource
android
android security
android tr
app security
big data
biometrics
blackhat
C++
chrome
chrome enterprise
chrome security
connected devices
CTF
diversity
encryption
federated learning
fuzzing
Gboard
google play
google play protect
hacking
interoperability
iot security
kubernetes
linux kernel
memory safety
Open Source
pha family highlights
pixel
privacy
private compute core
Rowhammer
rust
Security
security rewards program
sigstore
spyware
supply chain
targeted spyware
tensor
Titan M2
VDP
vulnerabilities
workshop
Archive
2025
Jan
2024
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2023
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2022
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2021
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2020
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2019
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2018
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2017
Dec
Nov
Oct
Sep
Jul
Jun
May
Apr
Mar
Feb
Jan
2016
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2015
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2014
Dec
Nov
Oct
Sep
Aug
Jul
Jun
Apr
Mar
Feb
Jan
2013
Dec
Nov
Oct
Aug
Jun
May
Apr
Mar
Feb
Jan
2012
Dec
Sep
Aug
Jun
May
Apr
Mar
Feb
Jan
2011
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
2010
Nov
Oct
Sep
Aug
Jul
May
Apr
Mar
2009
Nov
Oct
Aug
Jul
Jun
Mar
2008
Dec
Nov
Oct
Aug
Jul
May
Feb
2007
Nov
Oct
Sep
Jul
Jun
May
Feed
Follow @google
Follow
Give us feedback in our
Product Forums
.