SoK: XML Parser Vulnerabilities

The Extensible Markup Language (XML) has become a widely used data format for web services, Single Sign-On, and various desktop applications. The core of all XML processing is the XML parser. Attacks on XML parsers, such as the Billion Laughs attack and the XML External Entity (XXE) attack, have been known since 2002. Nevertheless, even experienced companies such as Google and Facebook were recently affected by such vulnerabilities. In this paper, we systematically analyze known attacks on XML parsers and discuss the challenges they pose and their possible solutions. Moreover, as a result of our in-depth analysis, we found three novel attacks. We conducted a large-scale analysis of 30 different XML parsers across six programming languages. We created an evaluation framework that applies different variants of 17 XML parser attacks, and we executed a total of 1459 attack vectors to provide valuable insight into each parser's configuration. We found vulnerabilities in 66 % of the default configurations of all tested parsers. In addition, we comprehensively inspected the parser features that prevent these attacks, show their unexpected side effects, and propose secure configurations.
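To make the two attack classes named above concrete, the following minimal Python sketch shows a scaled-down Billion Laughs payload, an XXE payload, and one possible countermeasure: rejecting every document that carries a DOCTYPE declaration. It is an illustration under stated assumptions, not the evaluation framework or a secure configuration taken from the paper. The names `parse_without_dtd`, `DTDForbidden`, `BILLION_LAUGHS`, and `XXE` are hypothetical and chosen only for this example; the parser calls come from Python's standard `xml.parsers.expat` module.

```python
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration (hypothetical name)."""


# Scaled-down Billion Laughs payload: nested internal entities whose
# expansion grows exponentially once more levels are added.
BILLION_LAUGHS = b"""<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;">
]>
<lolz>&lol2;</lolz>"""

# XXE payload: an external entity that points at a local file.
XXE = b"""<?xml version="1.0"?>
<!DOCTYPE data [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<data>&xxe;</data>"""


def parse_without_dtd(data):
    """Parse XML bytes and return the element names in document order.

    Any DOCTYPE declaration aborts parsing, so neither internal nor
    external entities can be declared by the document.
    """
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called when the DOCTYPE declaration starts, before its entity
        # declarations are processed; raising here aborts the parse.
        raise DTDForbidden("DOCTYPE %r is not allowed" % name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)
    return names
```

Forbidding DTDs outright is a blunt design choice: it blocks both payloads above, but it also rejects benign documents that legitimately rely on a DTD, which is one example of the kind of side effect a protective parser feature can have.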
