Superion: Grammar-Aware Greybox Fuzzing

In recent years, coverage-based greybox fuzzing has proven to be one of the most effective techniques for finding security bugs in practice. In particular, American Fuzzy Lop (AFL) has been highly successful at fuzzing programs that take relatively simple test inputs. Unfortunately, for structured test inputs such as XML and JavaScript, AFL's grammar-blind trimming and mutation strategies hinder its effectiveness and efficiency. To this end, we propose a grammar-aware coverage-based greybox fuzzing approach to fuzz programs that process structured inputs. Given the grammar of the test inputs (which is often publicly available), we introduce a grammar-aware trimming strategy that trims test inputs at the tree level using the abstract syntax trees (ASTs) of parsed test inputs. Further, we introduce two grammar-aware mutation strategies: enhanced dictionary-based mutation and tree-based mutation. Specifically, tree-based mutation works by replacing subtrees in the ASTs of parsed test inputs. Equipped with grammar awareness, our approach can extend the fuzzing exploration in both breadth and depth. We implemented our approach as an extension to AFL, named Superion, and evaluated its effectiveness on large-scale programs (i.e., an XML engine libplist and three JavaScript engines WebKit, Jerryscript and ChakraCore). Our results demonstrate that, compared with AFL and jsfunfuzz, Superion improves code coverage (by 16.7% in line coverage and 8.8% in function coverage) and bug-finding capability (34 new bugs, of which 22 are new vulnerabilities, with 19 CVEs assigned and 3.2K USD in bug bounty rewards received).
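To make the two tree-level operations concrete, the following is a minimal sketch on a toy AST, not Superion's actual implementation. The names `Node`, `tree_mutate`, `tree_trim`, and the `same_coverage` predicate are hypothetical; a real fuzzer would obtain the tree from a grammar-based parser and would decide whether a trimmed input is acceptable from the coverage feedback of the target program.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """Toy AST node (hypothetical): leaves carry text, inner nodes carry children."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

    def unparse(self) -> str:
        """Serialize the tree back into a test input."""
        if not self.children:
            return self.text
        return "".join(child.unparse() for child in self.children)


def walk(node: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def clone(node: Node) -> Node:
    """Deep-copy a tree so that the original input is never modified."""
    return Node(node.label, node.text, [clone(c) for c in node.children])


def tree_mutate(target: Node, donor: Node, rng: random.Random) -> Node:
    """Tree-based mutation: replace one subtree of `target` with a subtree of `donor`."""
    mutated = clone(target)
    slots = [(p, i) for p in walk(mutated) for i in range(len(p.children))]
    if not slots:
        return mutated  # a single-node tree has no replaceable subtree
    parent, index = rng.choice(slots)
    parent.children[index] = clone(rng.choice(list(walk(donor))))
    return mutated


def tree_trim(root: Node, same_coverage: Callable[[str], bool]) -> Node:
    """Tree-level trimming: greedily delete subtrees while `same_coverage` holds.

    `same_coverage` is an assumed stand-in for re-running the target and
    checking that the trimmed input still exercises the same coverage.
    """
    current = clone(root)
    changed = True
    while changed:
        changed = False
        for parent in walk(current):
            for i in range(len(parent.children)):
                saved = parent.children
                parent.children = saved[:i] + saved[i + 1:]
                if same_coverage(current.unparse()):
                    changed = True  # keep the deletion and rescan the tree
                    break
                parent.children = saved  # deletion changed behaviour: undo it
            if changed:
                break
    return current


if __name__ == "__main__":
    seed = Node("program", children=[
        Node("stmt", "var a=1;"), Node("stmt", "f(a);"), Node("stmt", "var b=2;"),
    ])
    donor = Node("program", children=[Node("stmt", "g();")])
    print(tree_mutate(seed, donor, random.Random(0)).unparse())
    # Pretend that only the call to f matters for coverage.
    print(tree_trim(seed, lambda s: "f(" in s).unparse())  # prints f(a);
```

Because both operations act on whole subtrees rather than on raw bytes, the inputs they produce are more likely to stay syntactically well formed than those produced by grammar-blind chunk removal or bit flips.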
