Path- and index-sensitive string analysis based on monadic second-order logic

We propose a novel technique for statically verifying the strings generated by a program. The verification is conducted by encoding the program in Monadic Second-Order Logic (M2L). We use M2L to describe constraints among program variables and to abstract built-in string operations. Once a program is encoded in M2L, a theorem prover for M2L, such as MONA, can automatically check whether a string generated by the program satisfies a given specification and, if not, exhibit a counterexample. With this approach, we can naturally encode relationships among strings, including cases in which a program manipulates strings using indices. In addition, our string analysis is path sensitive in that it accounts for the effects of string and Boolean comparisons, as well as regular-expression matches. We have implemented our string-analysis algorithm and used it to augment an industrial security analysis for Web applications by automatically detecting and verifying sanitizers, that is, methods that eliminate malicious patterns from untrusted strings, making those strings safe to use in security-sensitive operations. On the 8 benchmarks we analyzed, our string analyzer discovered 128 previously unknown sanitizers, compared to 71 sanitizers detected by a previously presented string analysis.
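
To make the verification question concrete, the following is a minimal Python sketch of what "check a sanitizer against a specification and exhibit a counterexample" means. It is not the M2L encoding described above and does not use MONA: it swaps in a bounded brute-force enumeration over a tiny alphabet, so it only covers strings up to a fixed length, whereas an M2L-based check reasons about strings of every length. The alphabet, the specification, both sanitizers, and the function names are hypothetical and chosen only for illustration.

```python
import re
from itertools import product

# Hypothetical setup: a two-letter alphabet and a specification saying
# that a safe string contains no '<' character.
ALPHABET = "a<"
SPEC = re.compile(r"[^<]*")


def keep_prefix_before_bracket(s):
    """Hypothetical sanitizer that manipulates the string through an index."""
    i = s.find("<")
    if i >= 0:          # branch taken only when s contains '<'
        return s[:i]    # index-based operation: keep s[0..i)
    return s            # on this branch s is known to contain no '<'


def remove_first_bracket(s):
    """Hypothetical flawed sanitizer: removes only the first '<'."""
    return s.replace("<", "", 1)


def find_counterexample(sanitizer, max_len=4):
    """Return (input, output) violating SPEC, or None if none exists
    among all inputs over ALPHABET of length <= max_len."""
    for n in range(max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            s = "".join(chars)
            out = sanitizer(s)
            if SPEC.fullmatch(out) is None:
                return s, out
    return None


if __name__ == "__main__":
    print(find_counterexample(keep_prefix_before_bracket))  # None
    print(find_counterexample(remove_first_bracket))        # ('<<', '<')
```

The first sanitizer is safe only because of the interplay between the branch condition and the index returned by `find`, which is the kind of path- and index-dependent reasoning the abstract refers to. The second one fails on the input `<<`, and the sketch reports that input together with the offending output as a counterexample.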
