BanditFuzz: Fuzzing SMT Solvers with Reinforcement Learning

Satisfiability Modulo Theories (SMT) solvers are fundamental tools in software engineering and security research. If SMT solvers are to continue to have an impact, it is imperative that we develop efficient and systematic testing methods for them. To this end, we present BanditFuzz, a reinforcement-learning-driven fuzzing system that zeroes in on the grammatical constructs of well-formed solver inputs that are the root cause of performance or correctness issues in solvers-under-test. To the best of our knowledge, BanditFuzz is the first machine-learning-based fuzzer for SMT solvers. BanditFuzz takes as input a grammar G describing the well-formed inputs to a set of distinct solvers (say, P1 and P2) that implement the same specification, together with a fuzzing objective (e.g., maximize the relative performance difference between P1 and P2). It outputs a ranked list of grammatical constructs that are likely to maximize performance differences between P1 and P2 or that are root causes of errors in these solvers. Typically, mutation fuzzing is implemented as a set of random mutations applied to a given input. By contrast, the key innovation behind BanditFuzz is to model a grammar-preserving fuzzing mutator as a reinforcement learning (RL) agent that, via blackbox interactions with the programs-under-test, learns which grammatical constructs are most likely to cause an error or performance issue. Using BanditFuzz, we discovered 1,700 syntactically unique inputs that result in inconsistent answers across the state-of-the-art SMT solvers Z3, CVC4, Colibri, MathSAT, and Z3str3 over the floating-point and string SMT theories. Further, using BanditFuzz, we constructed two benchmark suites (with 400 floating-point and 300 string instances) that expose performance issues in all considered solvers. We also compared BanditFuzz against random, mutation, and evolutionary fuzzing methods. Given the same time budget for all methods, we observed up to a 31% improvement in performance fuzzing and up to an 81% improvement in the number of bugs found by BanditFuzz relative to these other methods.
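To make the idea of an RL agent ranking grammatical constructs concrete, the following is a minimal sketch only. The abstract does not state which learning algorithm is used, so the sketch assumes a Beta-Bernoulli Thompson sampling bandit. The names `ConstructBandit`, `fuzz_loop`, and `toy_oracle` are hypothetical, and the oracle is a stand-in for actually running solvers P1 and P2 on a mutated input and checking the fuzzing objective.

```python
import random


class ConstructBandit:
    """Beta-Bernoulli Thompson sampling over grammatical constructs (assumed model)."""

    def __init__(self, constructs, seed=0):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per construct; Beta(1, 1) is a uniform prior.
        self.stats = {c: [1, 1] for c in constructs}

    def select(self):
        # Draw one sample from each posterior and pick the construct with the largest draw.
        draws = {c: self.rng.betavariate(a, b) for c, (a, b) in self.stats.items()}
        return max(draws, key=draws.get)

    def update(self, construct, rewarded):
        # A rewarded mutation increments alpha; an unrewarded one increments beta.
        self.stats[construct][0 if rewarded else 1] += 1

    def ranking(self):
        # Rank constructs by posterior mean alpha / (alpha + beta), highest first.
        return sorted(
            self.stats,
            key=lambda c: self.stats[c][0] / sum(self.stats[c]),
            reverse=True,
        )


def toy_oracle(construct):
    # Hypothetical stand-in: pretend only inputs mutated with "fp.div"
    # expose a disagreement or slowdown between P1 and P2.
    return construct == "fp.div"


def fuzz_loop(bandit, oracle, rounds):
    # Each round: choose a construct to mutate with, observe the reward, update.
    for _ in range(rounds):
        construct = bandit.select()
        bandit.update(construct, oracle(construct))
    return bandit.ranking()


if __name__ == "__main__":
    bandit = ConstructBandit(["fp.add", "fp.mul", "fp.div", "fp.sqrt"], seed=0)
    print(fuzz_loop(bandit, toy_oracle, rounds=200))
```

In a real setting the oracle would generate a grammar-preserving mutation of an input using the selected construct, run both solvers, and reward the agent when the answers disagree or the runtime gap grows. The returned ranking plays the role of the ranked list of constructs described above.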
