GRIMOIRE: Synthesizing Structure while Fuzzing

In the past few years, fuzzing has received significant attention from the research community. However, most of this attention was directed towards programs without a dedicated parsing stage. For programs that do parse highly structured inputs, fuzzers which leverage the input structure of a program can achieve significantly higher code coverage than traditional fuzzing approaches. This gain in coverage is achieved by applying large-scale mutations in the application's input space. However, the improvement comes at the cost of requiring expert domain knowledge, as these fuzzers depend on structured input specifications (e.g., grammars). Grammar inference, a technique which can automatically generate such grammars for a given program, can be used to address this shortcoming. Such techniques usually infer a program's grammar in a pre-processing step and can miss important structures that are uncovered only later during normal fuzzing. In this paper, we present the design and implementation of GRIMOIRE, a fully automated coverage-guided fuzzer which works without any form of human interaction or preconfiguration; yet, it is still able to efficiently test programs that expect highly structured inputs. We achieve this by performing large-scale mutations in the program input space using grammar-like combinations to synthesize new highly structured inputs without any pre-processing step. Our evaluation shows that GRIMOIRE outperforms other coverage-guided fuzzers when fuzzing programs with highly structured inputs. Furthermore, it improves upon existing grammar-based coverage-guided fuzzers. Using GRIMOIRE, we identified 19 distinct memory corruption bugs in real-world programs and obtained 11 new CVEs.
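To make the idea of grammar-like combinations more concrete, the following is a toy Python sketch and not GRIMOIRE's actual implementation. It assumes a hypothetical `coverage` function standing in for feedback from an instrumented target, and the names `GAP`, `generalize`, and `fill_gaps` are invented for illustration. The sketch replaces tokens that are not needed to reach a given piece of coverage with gaps, then fills those gaps with fragments of another input to synthesize a larger, nested input.

```python
GAP = None  # placeholder marking a position where other fragments may be inserted


def coverage(data: str) -> frozenset:
    """Hypothetical stand-in for the coverage reported by an instrumented target."""
    seen = set()
    if "if" in data:
        seen.add("if")
    if "(" in data and ")" in data:
        seen.add("parens")
    if "end" in data:
        seen.add("end")
    return frozenset(seen)


def generalize(data: str, needed: frozenset, sep: str = " ") -> list:
    """Turn every token into a GAP if the input still reaches `needed` without it."""
    tokens = data.split(sep)
    for i in range(len(tokens)):
        # Candidate input: all remaining concrete tokens except token i.
        rest = [t for j, t in enumerate(tokens) if j != i and t is not GAP]
        if needed <= coverage(sep.join(rest)):
            tokens[i] = GAP
    # Collapse runs of adjacent gaps into a single gap.
    collapsed = []
    for t in tokens:
        if t is GAP and collapsed and collapsed[-1] is GAP:
            continue
        collapsed.append(t)
    return collapsed


def fill_gaps(template: list, donor: list, sep: str = " ") -> str:
    """Synthesize a new input by substituting the donor's tokens into every gap."""
    donor_text = sep.join(t for t in donor if t is not GAP)
    parts = [donor_text if t is GAP else t for t in template]
    return sep.join(p for p in parts if p)


template = generalize("if (x) then y end", frozenset({"if", "end"}))
nested = fill_gaps(template, template)
```

Here `template` keeps only the tokens that the assumed coverage depends on, and `nested` places one generalized input inside another. A real fuzzer would run each synthesized input against the target and keep it only if it triggers new coverage.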