Exploiting structure for scalable software verification

Software bugs are expensive. Recent estimates by the US National Institute of Standards and Technology put the cost of software bugs to the US economy alone at approximately 60 billion USD annually. As society becomes increasingly software-dependent, bugs also reduce our productivity and threaten our safety and security. Decreasing these direct and indirect costs represents a significant research challenge as well as an opportunity for businesses. Automatic software bug-finding and verification tools have the potential to revolutionize the software engineering industry by improving reliability and decreasing development costs.

Since software analysis is in general undecidable, automatic tools have to use various abstractions to make the analysis computationally tractable. Abstraction is a double-edged sword: coarser abstractions generally make verification easier but yield less precise results. This thesis focuses on exploiting the structure of software to abstract away irrelevant behavior. Programmers tend to organize code into objects and functions, which effectively represent natural abstraction boundaries. Humans use such structural abstractions to simplify their mental models of software and to construct informal explanations of why a piece of code should work. A natural question to ask is: how can automatic bug-finding tools exploit the same natural abstractions?

This thesis offers possible answers. More specifically, I present three novel ways to exploit structure at three different steps of the software analysis process. First, I show how symbolic execution can preserve the data-flow dependencies of the original code while constructing compact symbolic representations of programs. Second, I propose

Footnote 1: For details, see [1].

[1]  Alonzo Church,et al.  A note on the Entscheidungsproblem , 1936, Journal of Symbolic Logic.

[2]  Donald W. Loveland,et al.  A machine program for theorem-proving , 1962, CACM.

[3]  David A. Patterson,et al.  Computer Architecture: A Quantitative Approach , 1969 .

[4]  H. H. Guild.  Some cellular logic arrays for non-restoring binary division , 1970 .

[5]  James C. King,et al.  Symbolic execution and program testing , 1976, CACM.

[6]  Lori A. Clarke,et al.  A System to Generate Test Data and Symbolically Execute Programs , 1976, IEEE Transactions on Software Engineering.

[7]  Patrick Cousot,et al.  Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints , 1977, POPL.

[8]  Greg Nelson,et al.  Simplification by Cooperating Decision Procedures , 1979, TOPL.

[9]  G. S. Tseitin.  On the Complexity of Derivation in Propositional Calculus , 1983 .

[10]  Robert E. Shostak,et al.  Deciding Combinations of Theories , 1982, JACM.

[11]  Randal E. Bryant,et al.  Graph-Based Algorithms for Boolean Function Manipulation , 1986, IEEE Transactions on Computers.

[12]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.

[13]  M. Wegman,et al.  Global value numbers and redundant computations , 1988, POPL '88.

[14]  Laura K. Dillon,et al.  Using symbolic execution for verification of Ada tasking programs , 1990, TOPL.

[15]  Edsger W. Dijkstra,et al.  Predicate Calculus and Program Semantics , 1989, Texts and Monographs in Computer Science.

[16]  Arthur B. Maccabe,et al.  The program dependence web: a representation supporting control-, data-, and demand-driven interpretation of imperative languages , 1990, PLDI '90.

[17]  Mark N. Wegman,et al.  Efficiently computing static single assignment form and the control dependence graph , 1991, TOPL.

[18]  William Landi,et al.  Undecidability of static analysis , 1992, LOPL.

[19]  G. Ramalingam,et al.  The undecidability of aliasing , 1994, TOPL.

[20]  Guang R. Gao,et al.  A linear time algorithm for placing φ-nodes , 1995, POPL '95.

[21]  Patrice Godefroid,et al.  Partial-Order Methods for the Verification of Concurrent Systems , 1996, Lecture Notes in Computer Science.

[22]  Raymond Lo,et al.  Effective Representation of Aliases and Indirect Memory Operations in SSA Form , 1996, CC.

[23]  Bjarne Steensgaard,et al.  Points-to analysis in almost linear time , 1996, POPL '96.

[24]  Gerardus Sierksma,et al.  Linear and integer programming - theory and practice , 1999, Pure and applied mathematics.

[25]  Louis Goubin,et al.  Trapdoor one-way permutations and multivariate polynomials , 1997, ICICS.

[26]  Bart Selman,et al.  Problem Structure in the Presence of Perturbations , 1997, AAAI/IAAI.

[27]  Gerard J. Holzmann,et al.  The Model Checker SPIN , 1997, IEEE Trans. Software Eng..

[28]  Steven S. Muchnick,et al.  Advanced Compiler Design and Implementation , 1997 .

[29]  Paul Havlak,et al.  Nesting of reducible and irreducible loops , 1997, TOPL.

[30]  K. Rustan M. Leino,et al.  Extended static checking , 1998, PROCOMET.

[31]  Mikkel Thorup,et al.  All Structured Programs have Small Tree-Width and Good Register Allocation , 1998, Inf. Comput..

[32]  David L. Dill,et al.  A decision procedure for bit-vector arithmetic , 1998, Proceedings 1998 Design and Automation Conference. 35th DAC. (Cat. No.98CH36175).

[33]  Hans van Maaren,et al.  A two phase algorithm for solving a class of hard satisfiability problems , 1998 .

[34]  David A. Schmidt,et al.  Program Analysis as Model Checking of Abstract Interpretations , 1998, SAS.

[35]  Flemming Nielson,et al.  Principles of Program Analysis , 1999, Springer Berlin Heidelberg.

[36]  Toby Walsh,et al.  Morphing: Combining Structure and Randomness , 1999, AAAI/IAAI.

[37]  Joao Marques-Silva,et al.  The Impact of Branching Heuristics in Propositional Satisfiability Algorithms , 1999, EPIA.

[38]  Alan J. Hu,et al.  Automatic formal verification of DSP software , 2000, DAC.

[39]  Dawson R. Engler,et al.  Checking system rules using system-specific, programmer-written compiler extensions , 2000, OSDI.

[40]  Sriram K. Rajamani,et al.  Bebop: A Symbolic Model Checker for Boolean Programs , 2000, SPIN.

[41]  Reinhard Wilhelm,et al.  Shape Analysis , 2000, CC.

[42]  Ofer Shtrichman.  Tuning SAT Checkers for Bounded Model Checking , 2000, CAV 2000.

[43]  M. Moskewicz,et al.  Chaff: engineering an efficient SAT solver , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[44]  David L. Dill,et al.  A decision procedure for an extensional theory of arrays , 2001, Proceedings 16th Annual IEEE Symposium on Logic in Computer Science.

[45]  Cormac Flanagan,et al.  Avoiding exponential explosion: generating compact verification conditions , 2001, POPL '01.

[46]  Andrea De Lucia,et al.  Program slicing: methods and applications , 2001, Proceedings First IEEE International Workshop on Source Code Analysis and Manipulation.

[47]  Carlo Ghezzi,et al.  Using symbolic execution for verifying safety-critical systems , 2001, ESEC/FSE-9.

[48]  Helena Ramalhinho Dias Lourenço,et al.  Iterated Local Search , 2001, Handbook of Metaheuristics.

[49]  Natarajan Shankar,et al.  Deconstructing Shostak , 2001, Proceedings 16th Annual IEEE Symposium on Logic in Computer Science.

[50]  Dawson R. Engler,et al.  Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.

[51]  Armin Biere,et al.  Bounded Model Checking Using Satisfiability Solving , 2001, Formal Methods Syst. Des..

[52]  Patrick Cousot,et al.  Modular Static Program Analysis , 2002, CC.

[53]  Natarajan Shankar,et al.  Combining Shostak Theories , 2002, RTA.

[54]  Calogero G. Zarba,et al.  Combining Decision Procedures , 2002, 10th Anniversary Colloquium of UNU/IIST.

[55]  Dawson R. Engler,et al.  A system and language for building system-specific, static analyses , 2002, PLDI '02.

[56]  J. Saxe,et al.  Extended static checking for Java , 2002, PLDI '02.

[57]  Sriram K. Rajamani,et al.  The SLAM project: debugging system software via static analysis , 2002, POPL '02.

[58]  Philippe Schnoebelen,et al.  The Complexity of Temporal Logic Model Checking , 2002, Advances in Modal Logic.

[59]  Alan J. Hu,et al.  Automatic formal verification for scheduled VLIW code , 2002, LCTES/SCOPES '02.

[60]  G. Ramalingam,et al.  On loops, dominators, and dominance frontiers , 2002, TOPL.

[61]  Henry S. Warren,et al.  Hacker's Delight , 2002 .

[62]  James C. Spall,et al.  Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.

[63]  Xinming Ou,et al.  Theorem Proving Using Lazy Proof Explication , 2003, CAV.

[64]  Ohad Shacham,et al.  Tuning the VSIDS decision heuristic for bounded model checking , 2003, Proceedings. 4th International Workshop on Microprocessor Test and Verification - Common Challenges and Solutions.

[65]  Niklas Sörensson,et al.  An Extensible SAT-solver , 2003, SAT.

[66]  Daniel Kroening,et al.  Behavioral consistency of C and Verilog programs using bounded model checking , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).

[67]  Kwang-Ting Cheng,et al.  A signal correlation guided ATPG solver and its applications for solving difficult industrial cases , 2003, DAC '03.

[68]  Inês Lynce,et al.  Heuristic backtracking algorithms for SAT , 2003, Proceedings. 4th International Workshop on Microprocessor Test and Verification - Common Challenges and Solutions.

[69]  C. A. R. Hoare,et al.  The verifying compiler: A grand challenge for computing research , 2003, JACM.

[70]  Patrick Cousot,et al.  A static analyzer for large safety-critical software , 2003, PLDI '03.

[71]  Shaul Markovitch,et al.  Learning to Order BDD Variables in Verification , 2003, J. Artif. Intell. Res..

[72]  Thomas Ball,et al.  A Theory of Predicate-Complete Test Coverage and Generation , 2004, FMCO.

[73]  Barbara G. Ryder,et al.  Precise Call Graphs for C Programs with Function Pointers , 2004, Automated Software Engineering.

[74]  Monica S. Lam,et al.  Cloning-based context-sensitive pointer alias analysis using binary decision diagrams , 2004, PLDI '04.

[75]  Thomas Stützle,et al.  Stochastic Local Search: Foundations & Applications , 2004 .

[76]  Neil Immerman,et al.  The Boundary Between Decidability and Undecidability for Transitive-Closure Logics , 2004, CSL.

[77]  Shuvendu K. Lahiri,et al.  Unbounded system verification using decision procedure and predicate abstraction , 2004 .

[78]  Mark N. Wegman,et al.  Analysis of pointers and structures , 1990, SIGP.

[79]  Jian Zhang.  Symbolic execution of program paths involving pointer structure variables , 2004 .

[80]  Vikram S. Adve,et al.  LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..

[81]  K. Rustan M. Leino,et al.  Loop Invariants on Demand , 2005, APLAS.

[82]  K. Rustan M. Leino,et al.  Efficient weakest preconditions , 2005, Inf. Process. Lett..

[83]  Emmanuel Zarpas,et al.  Benchmarking SAT Solvers for Bounded Model Checking , 2005, SAT.

[84]  K. Rustan M. Leino,et al.  Weakest-precondition of unstructured programs , 2005, PASTE '05.

[85]  Sanjit A. Seshia,et al.  Adaptive eager boolean encoding for arithmetic reasoning in verification , 2005 .

[86]  D. Babic,et al.  Modular Arithmetic Decision Procedure , 2005 .

[87]  Armin Biere,et al.  Effective Preprocessing in SAT Through Variable and Clause Elimination , 2005, SAT.

[88]  Stephen A. Edwards,et al.  Incremental Algorithms for Inter-procedural Analysis of Safety Properties , 2005, CAV.

[89]  Todd D. Millstein,et al.  Generating error traces from verification-condition counterexamples , 2005, Sci. Comput. Program..

[90]  David Detlefs,et al.  Simplify: a theorem prover for program checking , 2005, JACM.

[91]  Alexander Aiken,et al.  Context- and path-sensitive memory leak detection , 2005, ESEC/FSE-13.

[92]  Cesare Tinelli,et al.  The SMT-LIB Standard: Version 1.2 , 2005 .

[93]  Alan J. Hu,et al.  B-Cubing: New Possibilities for Efficient SAT-Solving , 2006, IEEE Transactions on Computers.

[94]  Sylvain Conchon,et al.  Strategies for combining decision procedures , 2006, Theor. Comput. Sci..

[95]  Manuel Laguna,et al.  Fine-Tuning of Algorithms Using Fractional Experimental Designs and Local Search , 2006, Oper. Res..

[96]  Joe Saur.  Review of "System Testing with an Attitude: An Approach That Nurtures Front-Loaded (Designed and built in... not tested in!) Software Quality by Nathan Petschenik", Dorset House Publishing, 2005, ISBN 0-932633-46-3 , 2006, SOEN.

[97]  Isil Dillig,et al.  Static error detection using semantic inconsistency inference , 2007, PLDI '07.

[98]  Alan J. Hu,et al.  Exploiting Shared Structure in Software Verification Conditions , 2007, Haifa Verification Conference.

[99]  David L. Detlefs,et al.  An Overview of the Extended Static Checking System , 2007 .

[100]  Kevin Leyton-Brown,et al.  SATzilla-07: The Design and Analysis of an Algorithm Portfolio for SAT , 2007, CP.

[101]  Ahmed Bouajjani,et al.  Context-Bounded Analysis of Multithreaded Programs with Dynamic Linked Structures , 2007, CAV.

[102]  Nikolaj Bjørner,et al.  Efficient E-Matching for SMT Solvers , 2007, CADE.

[103]  David L. Dill,et al.  A Decision Procedure for Bit-Vectors and Arrays , 2007, CAV.

[104]  Madan Musuvathi,et al.  Iterative context bounding for systematic testing of multithreaded programs , 2007, PLDI '07.

[105]  Alan J. Hu,et al.  Boosting Verification by Automatic Tuning of Decision Procedures , 2007 .

[106]  Frank M. Hutter.  SPEAR Theorem Prover , 2007 .

[107]  Alan J. Hu,et al.  Structural Abstraction of Software Verification Conditions , 2007, CAV.

[108]  Joël Ouaknine,et al.  Deciding Bit-Vector Arithmetic with Abstraction , 2007, TACAS.

[109]  Alan J. Hu,et al.  Calysto: scalable and precise extended static checking , 2008, ICSE.

[110]  Nikolai Tillmann,et al.  Demand-Driven Compositional Symbolic Execution , 2008, TACAS.

[111]  Joseph Sifakis,et al.  Model checking , 1996, Handbook of Automated Reasoning.