Blind XPath Injection

This paper describes a Blind XPath Injection attack that enables an attacker to extract a complete XML document used for XPath querying without prior knowledge of the XPath query. The attack is “complete” since all possible data is exposed. The attack makes use of two techniques – XPath crawling, and Booleanization of XPath queries. Using this attack, it is possible to get hold of the XML “database” used in the XPath query. This can be most powerful against sites that use XPath queries (and XML “databases”) for authentication, searching, and other uses. Compared to the SQL injection attacks, XPath Injection has the following upsides: • Since XPath is a standard (yet rich) language, it is possible to carry the attack ‘as-is’ for any XPath implementation. This is in contrast to SQL injection where different implementations have different SQL dialects (there is a common SQL language, but it is often too weak). • The XPath language can reference practically all parts of the XML document without access control restrictions, whereas with SQL, a "user" (which is a term undefined in the XPath/XML context) may be restricted to certain tables, columns or queries. So the outcome of the Blind XPath Injection attack is guaranteed to consist of the complete XML document, i.e. the complete database. These results enable an automated attack to fit any XPath based application provided that it possesses the basic security hole. Indeed, such pr oof of concept script was written and demonstrated on various XPath implementations.