A Breach in the Same-Origin Policy Induced by Mirroring External Content

Jacob Thompson
Independent Security Evaluators
Oct 19, 2016


Authored by Jacob Thompson, a Senior Security Analyst at Independent Security Evaluators.

What do Google Translate, the Internet Archive, and the free anonymization proxy Hide.me have in common? Each of these services mirrors web content on a different origin than where it originated, and this could be a threat to your privacy and security. The weakening of the same-origin policy induced by doing this is an underappreciated attack vector against web users. Under the right conditions, a malicious third-party site could track your browsing history, steal session cookies, and capture data entered into web pages that would otherwise be secure. Read along to understand more about the problem, the risks, and how you can protect yourself.

In certain circumstances developers find the need to write web applications that serve copies of content obtained from other, unaffiliated servers. Search engines offer users access to cached versions of web pages accessed by their crawlers. The Internet Archive seeks to preserve copies of web pages even after they are modified or removed. Translation services allow users to submit original and desired languages and an arbitrary URL; the service then fetches the URL on the server side and returns a translated copy of it to the user’s browser. Proxy services such as Hide.me allow users to submit a URL through a web form to view a web page anonymously, without the special software or configuration that a more robust solution like Tor might require.


The common thread binding these services is that they provide what I call “mirroring.” That is, they take HTML content (possibly containing JavaScript) from (possibly multiple) external domains, and serve copies of it from a single common origin. Figure 1 shows the request format to a hypothetical but typical mirroring service. Freely available packages such as CGIProxy and PHProxy make it easy for a server administrator to deploy a mirroring service following this design with minimal work.

http://www.example.com/fetch?url=http:%2f%2fwww.example.org%2f

Figure 1. Web applications provide mirroring if they accept an external URL and return a (possibly modified) copy of its contents to the client.
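The request format in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `/fetch` path and `url` parameter come from the figure, while the function names `mirror_url` and `mirrored_target` are hypothetical and not taken from any real mirroring package.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical mirroring endpoint, copied from Figure 1.
MIRROR_ENDPOINT = "http://www.example.com/fetch"


def mirror_url(target: str) -> str:
    """Build the URL a browser would request to view `target` via the mirror."""
    # Percent-encode the slashes so the target travels as one query parameter.
    return f"{MIRROR_ENDPOINT}?url={quote(target, safe=':')}"


def mirrored_target(request_url: str) -> str:
    """Recover the external URL that a mirror request refers to."""
    query = urlsplit(request_url).query
    return parse_qs(query)["url"][0]


request = mirror_url("http://www.example.org/")
# The browser only ever sees the mirror's host, whatever the target was.
browser_visible_host = urlsplit(request).hostname
```

Whatever external page is requested, `browser_visible_host` is always `www.example.com`. That single shared host is what the rest of this article is about.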

“Pure” mirroring services aim to serve static content. To service a request, they retrieve a stored copy of the requested URL, possibly modify the page so that any links inside point to the mirroring service, and return the content to the user. A user cannot log in to a web application through such a service, interact with RESTful APIs or other web services, and so on, since these mirrors neither serve dynamic content nor relay requests for such programmatically generated pages to the original server hosting them. The “cached page” services provided by Bing and Google, as well as the Internet Archive, are examples of pure mirroring services.

Proxies, on the other hand, relay arbitrary HTTP requests between the end user’s browser and upstream servers hosting the pages being accessed, acting as a bare-bones HTTP proxy server or VPN. Clearly, these services must rewrite any references to absolute URLs in the page, lest the user inadvertently “escape” from the proxy by following a link¹. So long as the proxy provides the necessary rewriting, there is no reason why users could not log in to and use an authenticated web application if desired, despite the obvious privacy risks of tunneling all traffic through a third party.
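The link rewriting described above might look roughly like the following Python sketch. It is deliberately simplified and rests on assumptions: the endpoint is the hypothetical one from Figure 1, `rewrite_links` is an illustrative name, and a regular expression over double-quoted `href` and `src` attributes stands in for the real HTML parsing a production proxy would need.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical proxy endpoint, following the format in Figure 1.
PROXY_ENDPOINT = "http://www.example.com/fetch"

# Matches double-quoted href/src attributes only; a real proxy needs an HTML parser.
_LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"', re.IGNORECASE)


def rewrite_links(html: str, page_url: str) -> str:
    """Point every href/src in `html` back at the proxy."""

    def _rewrite(match: re.Match) -> str:
        attr, link = match.group(1), match.group(2)
        # Resolve relative links against the page's original URL first.
        absolute = urljoin(page_url, link)
        proxied = PROXY_ENDPOINT + "?url=" + quote(absolute, safe="")
        return f'{attr}="{proxied}"'

    return _LINK_ATTR.sub(_rewrite, html)


page = '<a href="/about">About</a> <a href="http://other.example.net/">Other</a>'
rewritten = rewrite_links(page, "http://www.example.org/index.html")
```

After rewriting, both the relative link and the link to an unrelated domain point at the proxy, so following either one keeps the user inside it.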

The problem introduced by mirroring services that operate as shown in Figure 1 is that they cause the browser to treat all pages accessed through the service as if they were being served from the same origin. As a result, the same-origin policy that normally isolates pages served from different servers does not apply. Whether this is of any security consequence depends on the nature of the page being served, the capabilities and configuration of the mirroring service, and hardening techniques in use on the page being accessed (e.g., the HttpOnly cookie flag). Here are two categories of security compromise that could arise when a user accesses pages through a mirroring service:

  1. Cookie leaks. Suppose a user visits site A through the mirroring service, that this site sets an HTTP cookie, and that the mirroring service supports cookies and passes the cookie to the user’s browser. If the user later visits a page on unaffiliated site B through the mirroring service, the cookie previously set by site A could be exposed to site B, even though the same-origin policy normally prevents this.
  2. Total same-origin policy bypass. In a full proxy scenario, a user may use a mirror service to access an authenticated web application A that relies on the same-origin policy and CSRF protections to protect against unauthorized API calls originating from external sites. If the user later accesses page B through the mirror service, that page could make bidirectional HTTP requests to application A through the proxy, bypassing same-origin protections since the browser sees the two pages as originating from the same server.
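The cookie leak in the first scenario can be modeled with a toy cookie store. The sketch below is not how any real browser is implemented: `ToyBrowser`, `via_mirror`, and the cookie name are hypothetical, and cookies are keyed by host only, which ignores paths, the domain attribute, and other details.

```python
from collections import defaultdict
from urllib.parse import quote, urlsplit

# Hypothetical mirror endpoint, following the format in Figure 1.
MIRROR = "http://www.example.com/fetch?url="


def via_mirror(url: str) -> str:
    """Return the URL the browser actually loads when `url` is mirrored."""
    return MIRROR + quote(url, safe="")


class ToyBrowser:
    """Keeps one cookie store per host, as a simplified stand-in for a browser."""

    def __init__(self) -> None:
        self._jars = defaultdict(dict)

    def set_cookie(self, page_url: str, name: str, value: str) -> None:
        self._jars[urlsplit(page_url).hostname][name] = value

    def readable_cookies(self, page_url: str) -> dict:
        return dict(self._jars[urlsplit(page_url).hostname])


site_a = "http://secret.example.com/"
site_b = "http://spy.example.org/"

# Direct access: each site gets its own cookie store.
direct = ToyBrowser()
direct.set_cookie(site_a, "tos_accepted", "1")
seen_directly = direct.readable_cookies(site_b)

# Mirrored access: both pages load from the mirror's host and share one store.
mirrored = ToyBrowser()
mirrored.set_cookie(via_mirror(site_a), "tos_accepted", "1")
seen_via_mirror = mirrored.readable_cookies(via_mirror(site_b))
```

In the direct case site B sees nothing, while in the mirrored case it sees the cookie that site A set.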

I provide three concrete examples covering both of these scenarios below.

A History Leak

Suppose that secret.example.com hosts a page that, for social or other reasons, a user would prefer to visit without that fact being exposed. For example, the site might be subversive or considered blasphemous. The page uses the JavaScript document.cookie object to set a session cookie for some legitimate purpose, such as indicating that the user has agreed to a terms of service dialog or has confirmed that he or she is not a minor. Further, the cookie’s name, format, or other characteristics are unique enough that it can easily be identified in the browser’s cookie store from its name and value alone, and its presence makes it highly likely that the user visited secret.example.com.

The same-origin policy would normally prevent pages from other servers, such as spy.example.org, from accessing the cookies from other origins. Indeed, as Figure 2 shows, after accessing a page on secret.example.com that sets a cookie using the document.cookie object, viewing a page on spy.example.org that displays all cookies contained in the document.cookie object returns nothing.

Figure 2. The same-origin policy prevents a page on spy.example.org from reading a cookie set by secret.example.com.

Now suppose that the affected user accesses the two sites through a translation service instead of directly. For testing purposes, I’ve moved the two pages to the public-facing domains demo.securityevaluators.com and demo2.securityevaluators.com (standing in for secret.example.com and spy.example.org, respectively), but the same-origin policy still applies². Figure 3 shows how, when both sites are accessed through a translation service rather than directly, pages on demo2.securityevaluators.com can access cookies set by demo.securityevaluators.com that would normally be protected by the same-origin policy.

Figure 3. When both sites are accessed through a typical translation service, a page on demo2.securityevaluators.com can access cookies set by demo.securityevaluators.com even though they would normally be isolated by the same-origin policy.

Returning to the earlier terminology, in a real-world attack the page on spy.example.org might scan the current cookies for evidence that the user has accessed secret.example.com, and report that fact back to the server if such a cookie is found. Such “history leaks” are considered a violation of the browser security model, and where they have occurred elsewhere, have been treated as security bugs and fixed (e.g., as in CSS selectors). Because the attack shown here relies on JavaScript’s document.cookie object, rather than HTTP headers, this scenario could arise in any mirroring service, not just those that actively proxy traffic in real time.
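The scanning step of such a history leak amounts to a lookup over the cookie string. Here is a minimal Python sketch of that logic, assuming a hypothetical fingerprint table (`KNOWN_MARKERS`) and a cookie string in the `name=value; name2=value2` form that document.cookie returns; a real attack page would do the equivalent in JavaScript.

```python
def parse_cookie_string(cookie_string: str) -> dict:
    """Split a 'name=value; name2=value2' string into a dict."""
    cookies = {}
    for pair in cookie_string.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


# Hypothetical fingerprints: distinctive cookie names mapped to the site that sets them.
KNOWN_MARKERS = {"secret_tos_accepted": "secret.example.com"}


def visited_sites(cookie_string: str) -> set:
    """Return the sites whose marker cookies appear in the cookie string."""
    cookies = parse_cookie_string(cookie_string)
    return {site for name, site in KNOWN_MARKERS.items() if name in cookies}


leaked = visited_sites("lang=en; secret_tos_accepted=1")
```

The more distinctive the cookie name, the more confidently its presence can be tied to a visit.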

A Session Cookie Leak

Now, let’s consider a more consequential attack, in which a user’s (possibly-authenticated) session cookie is leaked between two servers. Unlike the history leak, this attack only works against mirroring services that actively relay traffic between the browser and server, so I concentrate on the Hide.me anonymization service.

Suppose that a user (directly and not through a mirroring service) visits and logs in to a web application on demo.securityevaluators.com, and then visits two attack pages on demo2.securityevaluators.com. One attack page attempts to read the cookie previously set by the web application in the HTTP request headers, while another attempts to obtain it using JavaScript. As shown in Figure 4, the same-origin policy blocks both attacks when a user accesses these two domains directly.

Figure 4. The same-origin policy blocks pages on demo2.securityevaluators.com from receiving cookies for demo.securityevaluators.com.

Now suppose that the affected user accesses the two sites through the Hide.me proxy service instead of directly. Figure 5 shows three windows, all containing pages accessed through Hide.me. The top window shows an authenticated session in the web application. The middle window shows an attack page (served from a different domain) that gathers cookies from HTTP request headers (i.e., it executes at the server-side). The bottom window shows an attack page that gathers cookies from the JavaScript document.cookie object.

Figure 5. When legitimate pages on demo.securityevaluators.com and attack pages on demo2.securityevaluators.com are both accessed through a free proxy service that operates as a mirror, the browser fails to enforce the same-origin policy, allowing the attack page at the bottom of the image to access session cookies for other domains.

The presence of the PHPSESSID cookie on the latter attack page shows that mirror services can undermine the same-origin policy; a real attack page would send this session cookie to the attacker, who could then impersonate the user in the web application. This example is not intended to pick on Hide.me — the same attack applies to any proxy service that follows this design. In fact, I found a public instance of PHProxy to be susceptible in the same way.

While the example shown here requires that a user actively authenticate through the proxy service, this is not necessarily a prerequisite for this type of attack to be security-relevant. For example, suppose a web forum application tracks a guest user’s read and unread posts using an anonymous-but-unique session tied to an HTTP cookie. Were this cookie to be exposed to a third party, that adversary could replay it to the server to view the victim’s post reading history.


A Same-Origin Policy Bypass

The consequences of a browser’s inability to distinguish pages’ origins when they are accessed through a mirror-based web proxy service go well beyond the exposure of user history and even sessions. In fact, because browsers allow bidirectional AJAX requests between resources that fall under the same origin, when both an attack page and the site it targets are accessed through a mirror service that operates as shown in Figure 1, the attack page can interact with its target server in arbitrary ways.

Here is an example. Suppose a user accesses a web application on demo.securityevaluators.com that, among other things, allows the user to view and modify stored credit card information³. Then, the user accesses an attack page hosted on demo2.securityevaluators.com that attempts to retrieve the page containing the user’s credit card number through an AJAX request and display the retrieved source on the screen. As shown in Figure 6, the same-origin policy prevents this if both sites are accessed directly. In fact, the browser even logs a same-origin-related error to the browser console.

Figure 6. The same-origin policy, by default, blocks an attack site (bottom) from receiving the response to an AJAX request to a different origin (top).

Now suppose that the affected user accesses the two sites through the Hide.me proxy service instead of directly. Figure 7 shows that this time, the attack page can successfully send an AJAX request to the application and receive the user’s credit card number. Because the attack page is specifically written so that Hide.me can rewrite the URL pointing to the credit card page, Hide.me rewrites it, causing the browser to treat the attack page and the credit card page as belonging to the same origin (proxy-nl.hide.me).

Figure 7. When accessing sites through a typical mirror-based proxy service, the browser’s same-origin policy fails to protect against cross-domain AJAX requests and responses, since all pages appear to the browser to originate from the same domain.
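The browser’s decision in Figures 6 and 7 comes down to comparing origins, that is, the scheme, host, and port of the two pages. The Python sketch below illustrates that comparison under stated assumptions: the page paths are made up, and the proxied URLs use the hypothetical Figure 1 format rather than Hide.me’s actual URL scheme, which is not reproduced here.

```python
from urllib.parse import quote, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

# Hypothetical proxy endpoint in the Figure 1 format (not Hide.me's real format).
PROXY = "http://www.example.com/fetch?url="


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a page's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    return origin(url_a) == origin(url_b)


def via_proxy(url: str) -> str:
    return PROXY + quote(url, safe="")


app_page = "http://demo.securityevaluators.com/card"        # illustrative path
attack_page = "http://demo2.securityevaluators.com/attack"  # illustrative path

blocked_when_direct = not same_origin(attack_page, app_page)
allowed_via_proxy = same_origin(via_proxy(attack_page), via_proxy(app_page))
```

Accessed directly, the two pages differ in host, so the response to the cross-origin request is withheld from the attack page. Accessed through the proxy, the target URL is merely a query parameter and the origins are identical.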

Affected Sites

Here is a list of sites where I tested and confirmed that pages hosted on one domain can use JavaScript’s document.cookie to access other cookies set through document.cookie from pages hosted on a different domain.

Solving the Problem

Clearly, mirror-based proxies can undermine a user’s privacy rather than enhancing it. But the designers of these services are not unaware of these risks, and privacy-targeted proxy services including Hide.me include some optional features designed to enhance privacy. In fact, the author of CGIProxy even has a security page listing the software’s limitations and recommending how users should cautiously use a proxy, although that page is silent on the threat that mirror-style proxies present to same-origin-based security. I address some security-enhancing features in typical proxies, their effectiveness, and their limitations below.

  1. Cookie blocking. Hide.me provides an “allow cookies” check box that, when cleared, blocks servers from setting cookies using HTTP response headers. Unfortunately, this provides neither usability nor security. First, blocking HTTP cookies prevents users from using web applications through the proxy that depend on cookies to track session state (although accessing security-sensitive pages through such a proxy is unwise anyway). Second, the “allow cookies” option does not affect cookies set using JavaScript’s document.cookie object rather than HTTP headers, and it would be all but impossible to design a proxy that could scan for and remove code that attempts to access the document.cookie object (theoretical computer scientists could prove using Rice’s theorem that problems like these are undecidable). Therefore, even with cookie blocking in place, history leaks, cookie leaks, and same-origin policy bypasses all remain possible in some instances.
  2. Script and object removal. Hide.me has an option to strip JavaScript code and objects (e.g., plugins and Java applets) from pages accessed through the proxy. Unfortunately, JavaScript is all but essential to modern websites, as Mozilla’s removal of the option to disable JavaScript in the Firefox user interface attests. Stripping JavaScript from pages delivered through a proxy service would render the service unusable for most sites.
  3. Clearing or resetting cookies. While examining the Google search engine’s cached page feature, I observed that the server’s response automatically resets any cookies sent with the request to the empty string and clears them. This is somewhat similar to the option to block cookies entirely, with the possible advantage that it can also remove cookies set by the JavaScript document.cookie object. Nevertheless, a full proxy service could not do this without breaking legitimate functionality and it is thus not a comprehensive solution.
  4. One-to-one hostname-to-subdomain mapping. Another proxy service I tested, Whoer.net, maps hostnames in URLs to unique proxy server subdomains using some form of mapping (e.g., a hash), rather than servicing all requests from the same domain as shown in Figure 1. This is, perhaps, the most effective way to maintain the same-origin policy while providing a web-based proxy, and it does thwart the three specific attacks I show above. Still, serving trusted and untrusted content from the same second-level domain is inherently dangerous; this is why Google hosts untrusted content on googleusercontent.com to separate it from authenticated and trusted content on google.com. For example, consider JavaScript code that programmatically calculates the document.domain value by stripping all but the top and second-level domains of the current page. Such code would malfunction, and expose cookies to external and untrusted sites, when accessed through the proxy.
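The hostname-to-subdomain idea in item 4 can be sketched as follows. This is an assumption-laden illustration, not Whoer.net’s actual scheme, which I have not inspected: the base domain `webproxy.example`, the truncated SHA-256 label, and both function names are hypothetical.

```python
import hashlib

# Hypothetical proxy base domain; every mirrored host gets its own subdomain of it.
PROXY_BASE = "webproxy.example"


def proxy_host_for(hostname: str) -> str:
    """Map an upstream hostname to a stable, unique-looking proxy subdomain."""
    digest = hashlib.sha256(hostname.lower().encode("utf-8")).hexdigest()
    # Prefix with a letter and truncate so the label stays short and valid.
    return f"h{digest[:16]}.{PROXY_BASE}"


def naive_document_domain(hostname: str) -> str:
    """The risky pattern from item 4: keep only the last two labels."""
    return ".".join(hostname.split(".")[-2:])


host_a = proxy_host_for("demo.securityevaluators.com")
host_b = proxy_host_for("demo2.securityevaluators.com")

# Distinct hosts restore origin isolation between the two mirrored sites...
isolated = host_a != host_b
# ...but code that relaxes to the parent domain lands on the shared proxy domain.
shared_parent = naive_document_domain(host_a) == naive_document_domain(host_b)
```

Both `isolated` and `shared_parent` are true here, which mirrors the trade-off described in item 4: the mapping separates origins, yet the mirrored sites still share a parent domain that page code may relax to.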

Web developers may wonder if there are any steps they can take to harden their sites from the attacks I have shown in the event that their end users access the site from an affected proxy. Here are some possible mitigations, but their effectiveness varies.

  1. The HttpOnly cookie flag. The HttpOnly flag blocks a cookie from being accessed through JavaScript or by plugins; that is, the browser still sends the cookie back to the server in the headers of each request, but otherwise prevents any access to it. As a best practice, all web applications should use the HttpOnly flag wherever possible. This blocks some, but not all, types of attack: it would block the history and cookie leaks I demonstrated, but could not prevent the same-origin policy bypass.
  2. Authorization headers in lieu of cookies. Many RESTful APIs use a custom Authorization request header rather than cookies for authentication. This is sufficient to mitigate the issues I have shown (since the authorization token does not accidentally “leak” to other pages just by virtue of belonging to the same origin like a cookie would), although the header is not a substitute for cookie-based authentication in all cases for usability reasons.
  3. Hardening headers. Correctly deploying HTTP Strict Transport Security and public key pinning headers could block a site from being accessed through a mirror-based proxy service (since the pinned key would necessarily mismatch the mirror’s key), but these would be ineffective against a proxy service that strips these headers for functional reasons.
  4. Blocking proxy sites. Highly security-conscious sites may choose to block access through mirror-based proxies as the security of these proxies is dubious even without considering the attacks I have shown.
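As a rough illustration of the first mitigation, the Python sketch below builds a Set-Cookie value with the HttpOnly flag using the standard library’s `http.cookies` module, and then models which cookies a script could read. The helper names and the tuple layout are hypothetical; the model only captures the rule that HttpOnly cookies are hidden from document.cookie.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Return a Set-Cookie value for a session cookie marked HttpOnly."""
    cookie = SimpleCookie()
    cookie["PHPSESSID"] = session_id
    cookie["PHPSESSID"]["httponly"] = True
    cookie["PHPSESSID"]["path"] = "/"
    return cookie["PHPSESSID"].OutputString()


def script_visible(cookies: list) -> dict:
    """Given (name, value, http_only) tuples, return what document.cookie would expose."""
    return {name: value for name, value, http_only in cookies if not http_only}


header = session_cookie_header("abc123")

# Two cookies sharing the mirror's origin: only the unflagged one is script-readable.
jar = [("PHPSESSID", "abc123", True), ("tos_accepted", "1", False)]
exposed = script_visible(jar)
```

With the flag set, an attack page sharing the mirror’s origin can no longer read the session cookie from script. It can still issue same-origin requests that carry the cookie, which is why this does not stop the same-origin policy bypass.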

Finally, end users may wonder how to defend against these issues. First, they should avoid accessing any authenticated applications or other sensitive material through a proxy that operates as shown in Figure 1. Users who wish to browse anonymously should instead use Tor or another proxy system that operates at the SOCKS or raw HTTP protocol level. Second, there is not much end users can do about the potential “history leak” problem when accessing static pages through the Internet Archive, translation services, or search engines’ cached page feature, so consider visiting such pages in an isolated private browsing session and closing it immediately after use to minimize the risk.

Here I have demonstrated that mirroring external HTML and JavaScript content under a common subdomain is a risky proposition. Because the risks fall on the sites being mirrored and on end users, and not on the mirroring services themselves, it is possible that this issue has not received the attention it deserves. Admittedly, the security issues I’ve demonstrated arise only when exact circumstances are met: the affected site must use web technologies in a particular way, and the victim must actually use the mirror service in question. Still, in an age of increasing global surveillance and spearphishing (consider the use of a one million dollar iOS zero day exploit against a UAE-based human rights activist), even violations of web security that seem special-case and farfetched at first glance are worthy of investigation and mitigation.

¹ Here I ignore the difficulty of applying such rewriting to AJAX requests and other dynamically-generated requests that comprise modern websites.

² In fact, where two pages share a common parent domain it is possible to relax the same-origin policy via JavaScript’s document.domain or via the domain cookie attribute, but I am not doing that here.

³ Of course, in the real world this page would be served over HTTPS rather than HTTP, and would not “echo back” a stored credit card number.
