Empirical software metrics for benchmarking of verification tools

We study empirical metrics for software source code that can predict the performance of verification tools on specific types of software. Our metrics comprise variable usage patterns, loop patterns, and indicators of control-flow complexity, and are extracted by simple data-flow analyses. We demonstrate that our metrics are powerful enough to devise a machine-learning-based portfolio solver for software verification. We show that this portfolio solver would have been the (hypothetical) overall winner of the international competition on software verification (SV-COMP) in three consecutive years (2014–2016). This gives strong empirical evidence for the predictive power of our metrics and demonstrates the viability of portfolio solvers for software verification. Moreover, we demonstrate the flexibility of our algorithm for portfolio construction in novel settings: originally conceived for SV-COMP’14, the construction works just as well for SV-COMP’15 (considerably more verification tasks) and for SV-COMP’16 (considerably more candidate verification tools).
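To make the idea of a metric-driven portfolio concrete, the following minimal sketch selects a verification tool for a new task from a small set of labeled examples. It is an illustration under stated assumptions, not the construction used in the paper: the three features, the tool names `tool_a` and `tool_b`, the training data, and the use of a k-nearest-neighbor vote as the learning step are all hypothetical stand-ins.

```python
from math import dist

# Hypothetical training set: (feature vector, best-performing tool).
# The features are illustrative counts, e.g. (loops, pointer variables, branches);
# they are assumptions for this sketch, not the metrics defined in the paper.
TRAINING = [
    ((0.0, 0.0, 2.0), "tool_a"),
    ((1.0, 0.0, 3.0), "tool_a"),
    ((5.0, 4.0, 9.0), "tool_b"),
    ((6.0, 5.0, 8.0), "tool_b"),
]


def select_tool(features, training=TRAINING, k=3):
    """Pick a tool by majority vote among the k nearest training tasks."""
    # Sort training tasks by Euclidean distance to the new task's features.
    nearest = sorted(training, key=lambda item: dist(item[0], features))[:k]

    # Count how often each tool was best among the nearest tasks.
    votes = {}
    for _, tool in nearest:
        votes[tool] = votes.get(tool, 0) + 1

    # On a tie, prefer the tool of the closest neighbor.
    best_count = max(votes.values())
    for _, tool in nearest:
        if votes[tool] == best_count:
            return tool


if __name__ == "__main__":
    print(select_tool((0.5, 0.0, 2.5)))
    print(select_tool((5.5, 4.5, 8.5)))
```

In this sketch the portfolio never runs more than one tool per task; it only predicts which tool to run, so its quality depends entirely on how well the features separate tasks that favor different tools.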
