LEGION: Best-First Concolic Testing

Concolic execution and fuzzing are two complementary coverage-based testing techniques. How to achieve the best of both remains an open challenge. To address this research problem, we propose and evaluate Legion. Legion re-engineers the Monte Carlo tree search (MCTS) framework from the AI literature to treat automated test generation as a problem of sequential decision-making under uncertainty. Its best-first search strategy provides a principled way to learn which program states are most promising to investigate at each search iteration, based on the rewards observed in previous iterations. To investigate the program states selected by MCTS, Legion incorporates a form of directed fuzzing that we call approximate path-preserving fuzzing (APPFUZZING). APPFUZZING serves as the Monte Carlo simulation technique and is implemented by extending prior work on constrained sampling. We evaluate Legion against competitors on 2531 benchmarks from the coverage category of Test-Comp 2020 and measure its sensitivity to hyperparameters, demonstrating its effectiveness on a wide variety of input programs.
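To make the best-first selection idea concrete, the following is a minimal sketch of the standard UCT selection rule from the MCTS literature. It is an illustration only: the abstract does not give Legion's actual score function or reward definition, so the `Node` class, the field names, the exploration constant `c`, and the use of "new paths found" as the reward are all assumptions made for this example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node standing for a program state (assumed structure)."""
    name: str
    visits: int = 0       # how many times this node has been selected
    reward: float = 0.0   # accumulated reward, e.g. new paths found (assumption)
    children: list = field(default_factory=list)


def uct_score(node, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score: average reward plus an exploration bonus."""
    if node.visits == 0:
        return math.inf  # always try an unvisited node first
    exploit = node.reward / node.visits
    explore = c * math.sqrt(math.log(max(parent_visits, 1)) / node.visits)
    return exploit + explore


def select(root):
    """Descend from the root, taking the highest-scoring child at each level."""
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: uct_score(ch, node.visits))
    return node


def backpropagate(path, reward):
    """Update statistics on every node along the selected path."""
    for node in path:
        node.visits += 1
        node.reward += reward


# Usage: a root with one heavily explored child and one lightly explored child.
root = Node("root", visits=10)
a = Node("a", visits=8, reward=2.0)
b = Node("b", visits=2, reward=1.0)
root.children = [a, b]
chosen = select(root)
```

In this sketch the less-visited child with the higher average reward is chosen, which reflects how observed rewards and uncertainty jointly steer the search. The simulation step (APPFUZZING in Legion) would then run on the chosen node and feed its reward into `backpropagate`.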
