Tests from Witnesses: Execution-Based Validation of Verification Results

The research community has made enormous progress in recent years in developing algorithms for verifying software, as shown by international competitions. Unfortunately, the transfer of this technology into industrial practice is slow. One reason might be that verification tools do not connect well to the developer workflow. This paper presents a solution to this problem: we use verification witnesses as an interface between verification tools and the testing process that every developer is familiar with. Many modern verification tools report, in case a bug is found, an error path as an exchangeable verification witness. Our approach is to synthesize a test from each witness, so that the developer can inspect the verification result using familiar technology, such as debuggers, profilers, and visualization tools. Moreover, this approach establishes witnesses as an interface between formal verification and testing: developers can use arbitrary (witness-producing) verification tools and arbitrary converters from witnesses to tests; we implemented two such converters. We performed a large experimental study to confirm that our proposed solution works well in practice: out of 18 966 verification results obtained from 21 verifiers, 14 727 results were confirmed by witness-based result validation, and 10 080 of these were confirmed by extracting and executing tests alone, meaning that the desired specification violation was effectively observed. We thus show that our approach is directly and immediately applicable to verification results produced by software verifiers that adhere to the international standard for verification witnesses.
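
To illustrate the idea of witness-to-test conversion, the following is a minimal sketch, not the paper's actual converters. It assumes a GraphML violation witness (the SV-COMP exchange format) whose edges record input values as assumptions of the form `\result == <value>`; the regular expression, the restriction to `__VERIFIER_nondet_int` inputs, and the default value 0 once the recorded inputs are exhausted are simplifications introduced here for brevity.

```python
#!/usr/bin/env python3
"""Hypothetical, simplified witness-to-test converter.

Reads a GraphML violation witness and emits a C test harness that
replays the recorded input values by defining the __VERIFIER_nondet_int
function used by SV-COMP benchmark programs.
"""
import re
import sys
import xml.etree.ElementTree as ET

# Namespace prefix of the GraphML-based witness format.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

def extract_values(witness_file):
    """Collect input values from assumptions of the form '\\result == <value>'."""
    values = []
    tree = ET.parse(witness_file)
    for edge in tree.iter(GRAPHML_NS + "edge"):
        for data in edge.iter(GRAPHML_NS + "data"):
            if data.get("key") == "assumption" and data.text:
                match = re.search(r"\\result\s*==\s*(-?\w+)", data.text)
                if match:
                    values.append(match.group(1))
    return values

def emit_harness(values):
    """Emit C code that returns the recorded values one after another."""
    table = ", ".join(values) if values else "0"
    return f"""\
static const int test_vector[] = {{ {table} }};
static unsigned int pos = 0;

int __VERIFIER_nondet_int(void) {{
    if (pos < sizeof(test_vector) / sizeof(test_vector[0]))
        return test_vector[pos++];
    return 0;  /* simplification: default once recorded inputs run out */
}}
"""

if __name__ == "__main__":
    print(emit_harness(extract_values(sys.argv[1])))
```

Compiling the emitted harness together with the original program yields an executable test: if running it triggers the reported specification violation, the verification result is confirmed by execution, and the developer can step through the failing run with a standard debugger.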
