Relational verification using reinforcement learning

Relational verification aims to prove properties that relate a pair of programs or two different runs of the same program. While relational properties (e.g., equivalence, non-interference) can be verified by reducing them to standard safety properties, there are typically many possible reduction strategies, only some of which lead to successful automated verification. Motivated by this problem, we propose a novel relational verification algorithm that learns useful reduction strategies using reinforcement learning. Specifically, we show how to formulate relational verification as a Markov Decision Process (MDP) and use reinforcement learning to synthesize an optimal policy for the underlying MDP. The learned policy is then used to guide the search for a successful verification strategy. We have implemented this approach in a tool called Coeus and evaluated it on two benchmark suites. Our evaluation shows that Coeus solves significantly more problems within a given time limit than multiple baselines, including two state-of-the-art relational verification tools.
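
To make the reduction from a relational property to a safety property concrete, the following is a minimal sketch of one well-known reduction, self-composition, applied to non-interference. It is not Coeus's algorithm and involves no learning: the two programs `leaky` and `safe`, the helper `self_composed_is_safe`, and the small input domain are all hypothetical, and exhaustive enumeration stands in for an automated safety verifier.

```python
from itertools import product

def leaky(high, low):
    # Hypothetical program: the public result depends on the secret input.
    return low + (1 if high > 0 else 0)

def safe(high, low):
    # Hypothetical program: the secret is read but never reaches the result.
    scratch = high * 2
    return low * 3

def self_composed_is_safe(program, domain):
    """Check non-interference as a safety property of two composed runs.

    Both runs share the public input `low` but may differ on the secret
    input; the safety assertion is that their public outputs agree.
    Enumeration over `domain` is an assumption that replaces a real verifier.
    """
    for high1, high2, low in product(domain, repeat=3):
        if program(high1, low) != program(high2, low):
            return False  # the assertion of the composed program fails
    return True

domain = range(-2, 3)
print(self_composed_is_safe(safe, domain))   # True
print(self_composed_is_safe(leaky, domain))  # False
```

Self-composition is only one of the possible reduction strategies; the abstract's point is that the choice among such strategies determines whether an automated verifier succeeds, and that choice is what the learned policy is meant to guide.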
