Java program analysis by symbolic execution

Program analysis has a long history in computer science. Even for the single important aspect of termination analysis, an overwhelming number of techniques have been developed over the past decades. While the programming languages considered by these approaches were initially of theoretical rather than practical importance, automated analyses for imperative programming languages like C or Java have recently been developed as well. Here, a major challenge is to deal with language constructs and concepts which do not exist in simpler languages. For example, Java programs often use dynamic dispatch, complex object hierarchies, or side effects with far-reaching consequences involving the global heap. In this thesis, we present a preprocessing step for Java Bytecode programs in which all such complicated language constructs are handled. As a result, subsequent analyses do not need to be concerned with these constructs, and existing techniques can easily be reused. In particular, we show how to construct Symbolic Execution Graphs, which contain an over-approximation of all possible program runs. Because care is taken to keep this approximation precise, the information contained in the constructed graphs can be used, for example, to reason about the termination behavior of the original program. In addition to the construction of such graphs, we present a new analysis technique which helps end users identify parts of the analyzed code which are irrelevant for the desired outcome. This way, programming errors that cause code not to be executed can be identified and, consequently, fixed by the user. For this technique to be useful, the information contained in the previously constructed graph needs to be precise, and we demonstrate that this is the case. For the techniques presented in this thesis, we give a rigorous formalization.
To meet the overall goal of, for example, automated termination analysis, the techniques and theoretical results also need to be implemented. In this thesis, we show how certain hard-to-automate aspects can be approached, leading to a competitive implementation. The techniques presented in this thesis are implemented in the AProVE tool. Since related techniques working on Symbolic Execution Graphs are also implemented in AProVE, users can analyze Java Bytecode programs for (non)termination and find irrelevant code with the click of a button. The annual International Termination Competition demonstrates that AProVE is currently the most powerful termination analyzer for Java Bytecode programs.
