ICE: A Robust Learning Framework for Synthesizing Invariants

Invariant generation lies at the heart of automated program verification, and the learning paradigm for synthesizing invariants is a promising new approach to this important problem. Unlike white-box techniques, which try to generate an invariant by analyzing the program, learning approaches try to synthesize the invariant from concrete configurations that the invariant must include or exclude, and are algorithmically based on learning theory and scalable machine learning algorithms. In this paper we argue that traditional learning paradigms that use concrete examples and counterexamples are inherently non-robust for synthesizing invariants. We introduce a more general learning paradigm, called ICE-learning, that learns using examples, counterexamples, and implications, and show that this paradigm allows building honest teachers and convergent mechanisms for invariant synthesis. We study the new paradigm of ICE-learning and develop several monotonic ICE-learning algorithms, as well as algorithms for two classes of non-monotonic domains (numerical invariants over scalar variables, and quantified invariants over arrays and dynamic lists), and establish convergence results for them. We implement these ICE-learning algorithms in a prototype verifier and, by evaluating them on a class of programs, show that the robustness of ICE-learning is practical and effective.
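The interaction between learner and teacher described above can be illustrated with a minimal sketch. It is not one of the paper's algorithms: it assumes a toy loop `x = 0; while x < 5: x += 1; assert x == 5`, a hypothetical hypothesis class of invariants of the form `x <= c`, and a teacher that checks candidates by brute-force enumeration over a small finite state space instead of calling a verifier. All names (`teacher`, `consistent`, `ice_learn`, `STATES`) are illustrative assumptions. The point it shows is that when a candidate fails to be inductive, the teacher returns an implication pair `(x, x')` rather than arbitrarily labeling `x` negative or `x'` positive.

```python
from typing import List, Optional, Set, Tuple

# Assumed toy program:  x = 0; while x < 5: x += 1; assert x == 5
STATES = range(-3, 10)  # assumed finite state space, checked exhaustively


def pre(x: int) -> bool: return x == 0      # precondition
def guard(x: int) -> bool: return x < 5     # loop guard
def step(x: int) -> int: return x + 1       # loop body
def post(x: int) -> bool: return x == 5     # postcondition at loop exit


def holds(c: int, x: int) -> bool:
    """Hypothetical hypothesis class: the candidate invariant is x <= c."""
    return x <= c


def teacher(c: int) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Return None if x <= c is an adequate invariant, else one ICE sample."""
    for x in STATES:  # precondition must imply the invariant
        if pre(x) and not holds(c, x):
            return ("example", (x,))
    for x in STATES:  # invariant at loop exit must imply the postcondition
        if holds(c, x) and not guard(x) and not post(x):
            return ("counterexample", (x,))
    for x in STATES:  # invariant must be preserved by the loop body
        if holds(c, x) and guard(x) and not holds(c, step(x)):
            return ("implication", (x, step(x)))
    return None


def consistent(c: int, pos: Set[int], neg: Set[int],
               imp: Set[Tuple[int, int]]) -> bool:
    """Check a candidate against all samples collected so far."""
    return (all(holds(c, x) for x in pos)
            and not any(holds(c, x) for x in neg)
            and all((not holds(c, x)) or holds(c, y) for x, y in imp))


def ice_learn(candidates: List[int]) -> Tuple[Optional[int], List[str]]:
    """Propose the first consistent candidate until the teacher accepts."""
    pos: Set[int] = set()
    neg: Set[int] = set()
    imp: Set[Tuple[int, int]] = set()
    log: List[str] = []  # kinds of samples the teacher returned, in order
    while True:
        c = next((k for k in candidates if consistent(k, pos, neg, imp)), None)
        if c is None:
            return None, log  # hypothesis class exhausted
        sample = teacher(c)
        if sample is None:
            return c, log
        kind, data = sample
        log.append(kind)
        if kind == "example":
            pos.add(data[0])
        elif kind == "counterexample":
            neg.add(data[0])
        else:
            imp.add((data[0], data[1]))


invariant, log = ice_learn(list(range(-3, 10)))
print(invariant, log)
```

In this sketch, trying candidates in ascending order yields one example followed by implication pairs before the learner settles on `x <= 5`; trying them in descending order instead yields a counterexample. Every sample the teacher returns rules out the current candidate, so the loop terminates for this finite hypothesis class.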
