Learning Shape Analysis

We present a data-driven verification framework that automatically proves memory safety of heap-manipulating programs. Our core contribution is a novel statistical machine learning technique that maps observed program states to (possibly disjunctive) separation logic formulas describing the invariant shape of (possibly nested) data structures at relevant program locations. We then check each predicted invariant with a program verifier; counterexamples to a predicted invariant are fed back to the shape predictor as additional input in a refinement loop. We have implemented our techniques in Locust, an extension of the GRASShopper verification tool. Locust automatically proves memory safety of implementations of classical heap-manipulating programs such as insertion sort, quicksort, and traversals of nested data structures.
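The predict, verify, and refine loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not Locust's implementation or API: the names `learn_invariant`, `toy_predict`, and `toy_verify`, the three candidate formulas, and the encoding of a heap state as a tuple of list cells are all assumptions made for the example. The toy predictor picks the most specific candidate consistent with the observed states (standing in for the learned shape predictor), and the toy verifier enumerates a fixed set of reachable states (standing in for a real program verifier).

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical encoding: a state is the tuple of cells reachable from x
# in a null-terminated list; the empty tuple means x == null.
State = Tuple[int, ...]

# Hypothetical candidate invariants, ordered from most to least specific.
CANDIDATES: List[Tuple[str, Callable[[State], bool]]] = [
    ("emp", lambda s: len(s) == 0),
    ("x |-> null", lambda s: len(s) == 1),
    ("ls(x, null)", lambda s: True),  # any null-terminated list segment
]

def toy_predict(states: List[State]) -> str:
    """Stand-in for the shape predictor: return the first candidate
    that holds on every observed state."""
    for name, holds in CANDIDATES:
        if all(holds(s) for s in states):
            return name
    raise ValueError("no candidate fits the observed states")

# Stand-in for the states the program can actually reach.
REACHABLE: List[State] = [(), (1,), (1, 2)]

def toy_verify(formula: str) -> Optional[State]:
    """Stand-in for the verifier: return a counterexample state that
    violates the formula, or None if the formula holds everywhere."""
    holds = dict(CANDIDATES)[formula]
    for s in REACHABLE:
        if not holds(s):
            return s
    return None

def learn_invariant(predict, verify, observed, max_rounds: int = 10):
    """Refinement loop: predict, verify, add counterexample, repeat."""
    states = list(observed)
    for _ in range(max_rounds):
        candidate = predict(states)
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate           # verified invariant
        if counterexample in states:
            return None                # no progress possible
        states.append(counterexample)  # refine with the new state
    return None

if __name__ == "__main__":
    print(learn_invariant(toy_predict, toy_verify, [()]))
```

Starting from the single observed state `()`, the first prediction `emp` is refuted by the one-cell list, and the second round generalizes to `ls(x, null)`, which the toy verifier accepts.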
