Retrofitting Code for Content Security Policy

Update: Google notified us that they re-factored the reCAPTCHA JavaScript code to be CSP friendly. We now recommend that you use their hosted code instead of our CSP friendly port.

In a previous blog post we shared how SendSafely uses Content Security Policy to minimize the risk of Cross-Site Scripting, commonly referred to as XSS (if you didn’t catch this post, you can check it out here). While it would have been easiest to design our site to use CSP from the beginning, the initial version of our website grew out of an internal research project and was not so fortunate. As a result, we needed to refactor a lot of our UI code to comply with a strict CSP. Specifically, we needed to get rid of the following two patterns that were fairly pervasive in our code:

Inline scripting. A sound CSP does not allow HTML and JavaScript to co-exist in the same document. Prior to CSP, we had a lot of in-line scripts.
Script code served from the same host. CSP best practices dictate that scripts should only run from a dedicated sub-domain that serves static content. This means that JavaScript not only needs to be in separate files, but also served from a completely different host.

Given the above requirements, we needed to figure out an efficient way to convert our existing UI code. As it turned out, our code followed a few simple patterns. Once we came up with a methodical way to convert each pattern, we had a game plan for moving forward with the site-wide conversion.

Common Code Patterns
When we analyzed our HTML code to see how we were using JavaScript, we were able to group about 90% of our use cases into two buckets:

Links or tag events that called no-arg functions:
<a href="javascript:doSomething()">Do something</a>
Links or tag events that called functions with one or more arguments:
<input type="text" name="email" onkeyup="doSomethingElse(arg1, arg2)">

The first case was simple. For elements that previously called a function on a specific event or on a click, we started by giving them a unique element id. Then, in the JavaScript code that loads from our static domain, we have a routine that always fires and looks specifically for each relevant id and programmatically registers the event on that element. So, for example, the first sample we showed you above would get converted to the following HTML (on our dynamic domain) and JavaScript (loaded from the static domain).

HTML:

<a id="my-link">Do something</a>

JavaScript:

var link = document.getElementById("my-link");
link.addEventListener("click", doSomething, false);

If you use jQuery (like we do) it can be done in a slightly more elegant fashion:

$("#my-link").click(function() {
  doSomething();
});
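With many elements to convert, a routine like the one described above could be driven by a small lookup table instead of one block per element. The following is only a sketch of that idea, not the code we actually ship: `registerHandlers`, the `bindings` table, and the stub `doSomething` are hypothetical names introduced for illustration.

```javascript
// Hypothetical stand-in for a page's real handler.
function doSomething() {
  console.log("doing something");
}

// Hypothetical table mapping element ids to the event and handler to register.
var bindings = [
  { id: "my-link", event: "click", handler: doSomething }
];

// Registers every binding whose element exists in the given document.
// Returns the ids that were actually bound, so missing elements are easy to spot.
function registerHandlers(doc, table) {
  var bound = [];
  table.forEach(function (b) {
    var el = doc.getElementById(b.id);
    if (el) {
      el.addEventListener(b.event, b.handler, false);
      bound.push(b.id);
    }
  });
  return bound;
}

// In a browser, run once the DOM has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", function () {
    registerHandlers(document, bindings);
  });
}
```

Passing the document in as a parameter keeps the routine easy to exercise outside a browser, and skipping ids that are not on the current page lets one shared script serve many pages.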

The second case is not quite as simple, but still relatively straightforward. The main difference is that we need to pass arguments into the JavaScript function. The data-* attributes are one of the earliest adopted parts of the HTML5 spec and have been supported by all major browsers for some time (http://caniuse.com/#feat=dataset). They let us declare data on a given HTML element that JavaScript can read elsewhere, so they are perfect for holding the values we were previously passing in as function arguments. We use the same technique as before to register the event handler, but also include references to the data-* attributes in the function call. So, going back to our example, the second sample we showed you would get converted to the following HTML (on our dynamic domain) and JavaScript (loaded from the static domain).

HTML:

<input type="text" id="my-email-field" name="email" data-arg-one="arg1" data-arg-two="arg2">

JavaScript (jQuery):

$("#my-email-field").keyup(function() {
  doSomethingElse(this.getAttribute("data-arg-one"), this.getAttribute("data-arg-two"));
});
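When an element carries more than a couple of arguments, it may be convenient to gather all of its data-* attributes in one pass. Here is a minimal sketch under that assumption: `readDataArgs` and the stub `doSomethingElse` are hypothetical helpers, not part of our actual code.

```javascript
// Hypothetical stand-in for the page's real handler.
function doSomethingElse(one, two) {
  return one + "," + two;
}

// Collects an element's data-* attributes into a plain object,
// so data-arg-one="arg1" becomes { "arg-one": "arg1" }.
function readDataArgs(el) {
  var args = {};
  for (var i = 0; i < el.attributes.length; i++) {
    var attr = el.attributes[i];
    if (attr.name.indexOf("data-") === 0) {
      args[attr.name.slice("data-".length)] = attr.value;
    }
  }
  return args;
}

// In a browser, wire the helper to the field from the example above.
if (typeof document !== "undefined") {
  var field = document.getElementById("my-email-field");
  if (field) {
    field.addEventListener("keyup", function () {
      var args = readDataArgs(this);
      doSomethingElse(args["arg-one"], args["arg-two"]);
    }, false);
  }
}
```

Reading the attributes at event time rather than at registration time means the handler always sees the current values, even if another script updates them.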

Web Workers
Unfortunately, not all of our JavaScript was covered by the above two examples. One of the more notable exceptions was how to incorporate HTML5 Web Workers into our policy. We use web workers when we encrypt and decrypt files using JavaScript, since CPU-intensive operations like that would otherwise cause the entire browser UI to freeze up during the process (which can take anywhere from a few seconds to several minutes). As it currently stands, most browsers require that web workers execute from JavaScript on the same domain that the page is loaded from. So, in the case of our website, pages loaded from www.sendsafely.com cannot run a web worker loaded from static.sendsafely.com. This is less than ideal from a security perspective since it requires an exception to our otherwise tight CSP.
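The same-origin restriction can be illustrated with a small check. This is only a sketch of the comparison involved, not browser internals: `isSameOrigin` is a hypothetical helper, and the https scheme and the /send and /js/crypto-worker.js paths are assumptions made for the example.

```javascript
// Returns true when scriptUrl (absolute or relative to the page) has the
// same scheme, host, and port as pageUrl.
function isSameOrigin(pageUrl, scriptUrl) {
  var page = new URL(pageUrl);
  var script = new URL(scriptUrl, pageUrl);
  return page.origin === script.origin;
}

// Assumed page URL and worker script locations, for illustration only.
var page = "https://www.sendsafely.com/send";
var fromStatic = isSameOrigin(page, "https://static.sendsafely.com/js/crypto-worker.js");
var fromDynamic = isSameOrigin(page, "/js/crypto-worker.js");
console.log(fromStatic, fromDynamic);
```

Under these assumptions the script on the static host fails the check while the one served alongside the page passes, which is why the worker script has to live on the dynamic server.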

In order to minimize the places where this exception is allowed, we defined a slightly looser policy for the two URLs that we use for sending (encrypting) and receiving (decrypting) files. Unlike other URLs on our site, these two pages allow scripts originating from the dynamic server to execute. We still don’t allow inline scripting, so the exposure on these pages remains fairly small: an attacker would still need to get a separate script file loaded from the same server. For now it seems we will need to live with this approach until a solution for loading web workers from a separate domain is possible.
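One way to express such a per-page exception is to pick the script-src directive based on the request path. The sketch below assumes that approach and is not our actual server code: `scriptSrcFor`, the /send and /receive paths, and the https scheme are all hypothetical, and a real policy would carry other directives as well.

```javascript
// Assumed origin of the dedicated static host.
var STATIC_HOST = "https://static.sendsafely.com";

// Hypothetical paths of the two pages that need same-origin worker scripts.
var WORKER_PAGES = ["/send", "/receive"];

// Builds the script-src directive for a given request path.
// Only the worker pages additionally allow 'self'; 'unsafe-inline' is never
// listed, so inline scripts stay blocked everywhere.
function scriptSrcFor(path) {
  var sources = [STATIC_HOST];
  if (WORKER_PAGES.indexOf(path) !== -1) {
    sources.push("'self'");
  }
  return "script-src " + sources.join(" ");
}

console.log(scriptSrcFor("/send"));
console.log(scriptSrcFor("/about"));
```

Keeping the list of exceptions in one place makes it easy to audit which pages run with the looser policy.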

Third Party Scripts (reCAPTCHA)
Like many sites, SendSafely uses reCAPTCHA to prevent bots and other automated processes from interacting with certain parts of our application. The reCAPTCHA AJAX API requires us to load certain scripts and images from Google servers (specifically from www.google.com/recaptcha/), which forced us to include www.google.com in our CSP (refer to the previous post to see how we’ve done that). In an ideal world, that would be the only change needed, but life is rarely that simple.

Unfortunately, the reCAPTCHA AJAX API does not play nicely with CSP: it will not run unless the policy allows inline scripts (unsafe-inline) and string evaluation (unsafe-eval). Of all the exceptions a CSP can allow, these two increase the attack surface the most, since they expose a wide variety of XSS attack variants. To better understand why the reCAPTCHA AJAX API requires them, let’s take a closer look at the two steps needed to implement the API (taken from https://developers.google.com/recaptcha/docs/display).

Step 1: Load the API JavaScript from Google

<script type="text/javascript"
  src="http://www.google.com/recaptcha/api/js/recaptcha_ajax.js">
</script>

Step 2: Display the CAPTCHA using the following code

Recaptcha.create("your_public_key",
  "element_id",
  {
    theme: "red",
    callback: Recaptcha.focus_response_field
  }
);

On the surface, both steps seem easy to run with CSP. The problem, however, lies in the contents of recaptcha_ajax.js. Specifically, the following three code patterns are present in this file and, unless refactored, require the unsafe-inline and unsafe-eval permissions:

Inline Event Handler Definitions
Inline Script within HREF Attributes
Use of String-to-Code in Function Calls

After some research and unsuccessful initial attempts to contact the reCAPTCHA team at Google, we decided to take a stab at refactoring some of the code to make it CSP friendly. Refactoring third-party code is never ideal, but if we could restrict our changes to just presentation-level code and not touch the code that invokes the server API, we would minimize the risk of introducing any breaking changes going forward.

As it turns out, the required changes to recaptcha_ajax.js are minimal and confined to that single JS file. Once updated, all we needed to do was load the refactored JS file from our server instead of remotely from the Google servers. Let’s take a close look at what was changed.

Inline Event Handler Definitions
Many of the reCAPTCHA HTML elements use in-line handler definitions for the onclick event. In order to comply with CSP, the handler definition must be rewritten in terms of addEventListener as shown below (the “a” function is used to dynamically generate an HTML “a” tag with the specified ID). Very easy.

Before:

a("recaptcha_whatsthis_btn").onclick = function () {
  Recaptcha.showhelp(); return !1 };

After:

document.getElementById("recaptcha_whatsthis_btn").addEventListener('click', function () { Recaptcha.showhelp(); return !1 });

Inline Script within HREF Attributes
reCAPTCHA uses a custom function to dynamically build certain document elements. The last argument for one of these functions (named c) is assigned to the HREF attribute of the element, which in some cases includes JavaScript. For these cases, the function call was modified to remove the last argument, and the code it contained is instead bound programmatically to the click event (using addEventListener as in the previous example).

Before:

c("recaptcha_reload", "refresh", "refresh_btn", "javascript:Recaptcha.reload();");

After:

c("recaptcha_reload", "refresh", "refresh_btn");
document.getElementById("recaptcha_reload_btn").addEventListener('click', function () {Recaptcha.reload()});

Use of String-to-Code in Function Calls
Some JavaScript functions, such as setTimeout() and setInterval(), accept either a function or a string that gets treated and executed as code (often referred to as string-to-code); eval() always executes its string argument as code. Passing a string argument to any of these functions (eval, setInterval, etc.) requires the unsafe-eval permission, which is definitely something we do not want to allow. In this case, as shown below, the code is relatively painless to convert since the string value is not dynamic in nature. This was the simplest change of all.

Before:

Recaptcha.timer_id = setInterval('Recaptcha.reload("t");', a);

After:

Recaptcha.timer_id = setInterval(function(){Recaptcha.reload("t")}, a);

By changing those three subtle patterns, we were able to safely run the reCAPTCHA AJAX API without loosening our CSP. We welcome anyone in the same boat to leverage our refactored JS code to run reCAPTCHA with CSP on their own site. As mentioned, we attempted to contact the reCAPTCHA team at Google during this effort with no success. Hopefully our changes will one day be reflected in the reCAPTCHA AJAX API code.