Search-Driven String Constraint Solving for Vulnerability Detection

Constraint solving is an essential technique for detecting vulnerabilities in programs, since it can reason about the sanitization and validation operations performed on user inputs. However, real-world programs typically contain complex string operations that challenge vulnerability detection. State-of-the-art string constraint solvers support only a limited set of string operations and fail when they encounter an unsupported one; this limits their effectiveness in finding vulnerabilities. In this paper, we propose a search-driven constraint solving technique that complements the support for complex string operations provided by any existing string constraint solver. Our technique uses a hybrid constraint solving procedure based on the Ant Colony Optimization meta-heuristic. The idea is to execute it as a fallback mechanism, only when a solver encounters a constraint containing an operation that it does not support. We have implemented the proposed search-driven constraint solving technique in the ACO-Solver tool, which we have evaluated in the context of injection and XSS vulnerability detection for Java Web applications. We have assessed the benefits and costs of combining the proposed technique with two state-of-the-art constraint solvers (Z3-str2 and CVC4). The experimental results, based on a benchmark with 104 constraints derived from nine realistic Web applications, show that our approach, when combined with a state-of-the-art solver, significantly increases the percentage of detected vulnerabilities (from 4.7% to 71.9% for Z3-str2, and from 85.9% to 100.0% for CVC4) and solves several cases on which the solver fails when used stand-alone (46 more solved cases for Z3-str2, and 11 more for CVC4), while still keeping the execution time affordable in practice.
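The fallback idea described above can be illustrated with a minimal Python sketch. It is not the ACO-Solver implementation: every name in it (`Constraint`, `SUPPORTED_OPS`, `primary_solver`, `search_fallback`, `solve_with_fallback`) is hypothetical, the "primary solver" is a toy stand-in for a solver such as Z3-str2 or CVC4, and the search step is a plain bounded enumeration used in place of the Ant Colony Optimization procedure. The sketch only shows the control flow: the search runs only when the primary solver reports an unsupported operation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet, Optional, Tuple


class UnsupportedOperation(Exception):
    """Raised by the toy primary solver on an operation it does not support."""


@dataclass(frozen=True)
class Constraint:
    ops: FrozenSet[str]            # names of the string operations used (assumed labels)
    check: Callable[[str], bool]   # True if a candidate string satisfies the constraint


# Assumption: the operations the toy primary solver claims to support.
SUPPORTED_OPS = frozenset({"length", "concat"})


def bounded_search(c: Constraint, alphabet: str = "ab ", max_len: int = 4) -> Optional[str]:
    """Enumerate candidate strings up to max_len and return the first that satisfies c."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if c.check(candidate):
                return candidate
    return None


def primary_solver(c: Constraint) -> Optional[str]:
    """Toy stand-in for an existing solver: rejects constraints with unsupported operations."""
    unsupported = c.ops - SUPPORTED_OPS
    if unsupported:
        raise UnsupportedOperation(sorted(unsupported))
    return bounded_search(c)


def search_fallback(c: Constraint) -> Optional[str]:
    """Stand-in for the search-driven procedure (here: enumeration, not ACO)."""
    return bounded_search(c)


def solve_with_fallback(c: Constraint) -> Tuple[Optional[str], str]:
    """Try the primary solver first; fall back to search only on an unsupported operation."""
    try:
        return primary_solver(c), "primary"
    except UnsupportedOperation:
        return search_fallback(c), "fallback"


# A constraint using a (hypothetically) unsupported "trim" operation.
trim_constraint = Constraint(
    ops=frozenset({"trim", "length"}),
    check=lambda s: s.strip() == "ab" and len(s) == 3,
)
solution, used = solve_with_fallback(trim_constraint)
print(used, repr(solution))
```

Keeping the dispatch separate from both solvers reflects the claim that the technique can complement any existing solver: only the `except` branch depends on the search procedure.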
