Learning Invariants using Decision Trees

The problem of inferring an inductive invariant for verifying program safety can be formulated in terms of binary classification. This is a standard problem in machine learning: given a sample of good and bad points, one is asked to find a classifier that generalizes from the sample and separates the two sets. Here, the good points are the reachable states of the program, and the bad points are the states from which a violation of the safety property is reachable. Thus, a learned classifier is a candidate invariant. In this paper, we propose a new algorithm that uses decision trees to learn candidate invariants in the form of arbitrary Boolean combinations of numerical inequalities. We have used our algorithm to verify C programs taken from the literature. The algorithm is able to infer safe invariants for a range of challenging benchmarks and compares favorably to other ML-based invariant inference techniques. In particular, it scales well to large sample sets.
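To make the classification view concrete, here is a minimal sketch (not the paper's algorithm) that fits an off-the-shelf decision tree to hypothetical good and bad program states over two variables x and y, then reads the learned classifier back as a Boolean combination of numerical inequalities. The sample data and the `tree_to_formula` helper are illustrative assumptions; only scikit-learn's standard `DecisionTreeClassifier` API is relied upon.

```python
# Sketch: invariant candidates as decision-tree classifiers.
# Assumed setup: states are points (x, y); label 1 = reachable ("good"),
# label 0 = can reach a safety violation ("bad"). Data is hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

good = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
bad = np.array([[0, 5], [1, 6], [2, 7]])
X = np.vstack([good, bad])
y = np.array([1] * len(good) + [0] * len(bad))

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

def tree_to_formula(tree, names, node=0):
    """Recursively render the fitted tree as an (unsimplified) Boolean
    combination of inequalities; 'true' leaves classify states as good."""
    t = tree.tree_
    if t.children_left[node] == -1:  # leaf node
        return "true" if int(np.argmax(t.value[node])) == 1 else "false"
    feat, thr = names[t.feature[node]], t.threshold[node]
    left = tree_to_formula(tree, names, t.children_left[node])
    right = tree_to_formula(tree, names, t.children_right[node])
    return f"(({feat} <= {thr:.1f} and {left}) or ({feat} > {thr:.1f} and {right}))"

# The printed formula is a candidate invariant: it holds on all sampled
# good states and excludes all sampled bad states.
print(tree_to_formula(clf, ["x", "y"]))
```

Because each internal node tests a single numerical feature against a threshold, every path through the tree denotes a conjunction of inequalities, and the disjunction over good-labeled leaves is exactly the kind of arbitrary Boolean combination the abstract describes.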
