Hawkeye: Towards a Desired Directed Grey-box Fuzzer

Grey-box fuzzing is a practically effective approach to testing real-world programs. However, most existing grey-box fuzzers lack directedness, i.e., the capability of steering execution towards user-specified target sites in the program. To emphasize existing challenges in directed fuzzing, we propose Hawkeye, which features four desired properties of directed grey-box fuzzers. Using a novel static analysis of the program under test and the target sites, Hawkeye precisely collects information such as the call graph and the function-level and basic-block-level distances to the targets. During fuzzing, Hawkeye evaluates exercised seeds based on both this static information and their execution traces to generate dynamic metrics, which are then used for seed prioritization, power scheduling, and adaptive mutation. These strategies help Hawkeye achieve better directedness and gravitate towards the target sites. We implemented Hawkeye as a fuzzing framework and evaluated it on various real-world programs under different scenarios. The experimental results show that Hawkeye can reach the target sites and reproduce crashes much faster than state-of-the-art grey-box fuzzers such as AFL and AFLGo. In particular, Hawkeye can reduce the time to exposure for certain vulnerabilities from about 3.5 hours to 0.5 hours. To date, Hawkeye has detected more than 41 previously unknown crashes in projects such as Oniguruma and MJS, with the target sites provided by vulnerability prediction tools; all of these crashes have been confirmed, and 15 of them have been assigned CVE IDs.
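The idea of combining static distances with execution traces can be illustrated with a minimal sketch. This is not Hawkeye's implementation or its actual metrics: it assumes plain hop counts on a toy call graph, averages them over the functions a seed executed, and sorts seeds by that average. All names (`distances_to_targets`, `seed_distance`, `prioritize`, and the toy functions and seeds) are hypothetical.

```python
from collections import deque


def distances_to_targets(call_graph, targets):
    """Return the minimum number of call edges from each function to any target."""
    # Reverse the edges so a single BFS from the targets reaches every caller.
    reverse = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)

    dist = {t: 0 for t in targets}
    queue = deque(targets)
    while queue:
        fn = queue.popleft()
        for caller in reverse.get(fn, ()):
            if caller not in dist:
                dist[caller] = dist[fn] + 1
                queue.append(caller)
    return dist  # functions that cannot reach a target are absent


def seed_distance(trace, dist):
    """Average distance over the executed functions that can reach a target."""
    reachable = [dist[fn] for fn in trace if fn in dist]
    return sum(reachable) / len(reachable) if reachable else None


def prioritize(seeds, dist):
    """Order seed names so that traces closer to the targets come first."""
    def key(item):
        d = seed_distance(item[1], dist)
        return float("inf") if d is None else d
    return [name for name, _ in sorted(seeds.items(), key=key)]


# Toy call graph (assumption): main -> parse -> decode -> target_fn, main -> log.
call_graph = {
    "main": ["parse", "log"],
    "parse": ["decode"],
    "decode": ["target_fn"],
    "log": [],
}
dist = distances_to_targets(call_graph, ["target_fn"])

# Each seed maps to the functions observed in its execution trace.
seeds = {"s1": ["main", "log"], "s2": ["main", "parse", "decode"]}
order = prioritize(seeds, dist)
print(order)
```

In this toy setting, seed `s2` is ranked ahead of `s1` because its trace contains functions that lie closer to the target. A real directed fuzzer would refine both the distance definition and how the resulting score feeds into power scheduling and mutation.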
