SLF: Fuzzing without Valid Seed Inputs

Fuzzing is an important technique for detecting software bugs and vulnerabilities. It works by mutating a small set of seed inputs to generate a large number of new inputs. A fuzzer's performance often degrades substantially when valid seed inputs are not available. Although existing techniques such as symbolic execution can generate seed inputs from scratch, they have various limitations that hinder their application to complex real-world software. In this paper, we propose a novel fuzzing technique that can generate valid seed inputs. It piggybacks on AFL to identify input validity checks and the input fields that affect those checks. It further classifies these checks according to their relation to the input; the classes include arithmetic relation, object offset, data structure length, and so on. A multi-goal search algorithm then applies class-specific mutations to satisfy interdependent checks simultaneously. We evaluate our technique on 20 popular benchmark programs collected from other fuzzing projects and the Google fuzzer test suite, and compare it with the existing fuzzers AFL and AFLFast, the symbolic execution engines KLEE and S2E, and the hybrid tool Driller, which combines fuzzing with symbolic execution. The results show that our technique is highly effective and efficient, outperforming the other tools.
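The idea of classifying validity checks and repairing them with class-specific mutations can be illustrated with a minimal sketch. It is not the paper's implementation: the toy input format, the constants `MAGIC`, `HEADER_LEN`, and `MARKER`, and every function name below are hypothetical, and the check classes are hard-coded here rather than inferred through AFL. The sketch only shows how fixing one check (the offset) can break another (the length), so the search has to loop until all goals hold together.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical toy format: 2-byte magic, 1-byte payload length,
# 1-byte offset of a marker byte, then the payload.
MAGIC = b"SL"
HEADER_LEN = 4
MARKER = 0x7F


def failed_checks(data: bytes) -> List[str]:
    """Return the class of every validity check that `data` fails."""
    failed = []
    if data[:2] != MAGIC:
        failed.append("arithmetic")  # field compared against a constant
    if len(data) < HEADER_LEN or data[2] != len(data) - HEADER_LEN:
        failed.append("length")      # field must equal the payload length
    if (len(data) < HEADER_LEN
            or not (HEADER_LEN <= data[3] < len(data))
            or data[data[3]] != MARKER):
        failed.append("offset")      # field must point at the marker byte
    return failed


def influencing_offsets(data: bytes) -> Dict[str, List[int]]:
    """Flip each byte and record which check outcomes change."""
    base = set(failed_checks(data))
    result: Dict[str, List[int]] = {}
    for i in range(len(data)):
        mutated = bytearray(data)
        mutated[i] ^= 0xFF
        for cls in base ^ set(failed_checks(bytes(mutated))):
            result.setdefault(cls, []).append(i)
    return result


def fix_arithmetic(data: bytearray) -> None:
    data[0:2] = MAGIC


def fix_length(data: bytearray) -> None:
    data[2] = len(data) - HEADER_LEN


def fix_offset(data: bytearray) -> None:
    if len(data) == HEADER_LEN:
        data.append(0)  # grows the payload, which may invalidate "length"
    data[3] = len(data) - 1
    data[data[3]] = MARKER


MUTATORS: Dict[str, Callable[[bytearray], None]] = {
    "arithmetic": fix_arithmetic,
    "length": fix_length,
    "offset": fix_offset,
}


def generate_seed(start: bytes, max_rounds: int = 16) -> Optional[bytes]:
    """Apply class-specific mutations until every check passes at once."""
    data = bytearray(start.ljust(HEADER_LEN, b"\x00"))
    for _ in range(max_rounds):
        failed = failed_checks(bytes(data))
        if not failed:
            return bytes(data)
        for cls in failed:
            MUTATORS[cls](data)
    return None


seed = generate_seed(b"")
print(seed, influencing_offsets(seed) if seed else None)
```

Starting from an empty input, the loop needs more than one round in this toy setting: repairing the offset check appends a payload byte, so the length field must be repaired in the following round. A real target would not expose its checks this directly, which is why the byte-flipping probe in `influencing_offsets` stands in, very loosely, for the step of finding which input fields affect each check.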
