CollAFL: Path Sensitive Fuzzing

Coverage-guided fuzzing is a widely used and effective technique for finding software vulnerabilities. Tracking code coverage and using it to guide fuzzing are crucial to coverage-guided fuzzers. However, tracking full and accurate path coverage is infeasible in practice because of the high instrumentation overhead. Popular fuzzers (e.g., AFL) therefore often use coarse coverage information, such as edge hit counts stored in a compact bitmap, to achieve highly efficient greybox testing. This inaccurate and incomplete coverage information seriously limits fuzzers. First, it causes path collisions, which prevent fuzzers from discovering potential paths that lead to new crashes. More importantly, it prevents fuzzers from making wise decisions on fuzzing strategies. In this paper, we propose a coverage-sensitive fuzzing solution, CollAFL. It mitigates path collisions by providing more accurate coverage information while still preserving low instrumentation overhead. It also uses the coverage information to apply three new fuzzing strategies, which speed up the discovery of new paths and vulnerabilities. We implemented a prototype of CollAFL based on the popular fuzzer AFL and evaluated it on 24 popular applications. The results showed that path collisions are common, i.e., up to 75% of edges could collide with others in some applications, and that CollAFL could reduce the edge collision ratio to nearly zero. Moreover, armed with the three fuzzing strategies, CollAFL outperforms AFL in terms of both code coverage and vulnerability discovery. On average, CollAFL covered 20% more program paths, found 320% more unique crashes, and found 260% more bugs than AFL in 200 hours. In total, CollAFL found 157 new security bugs, with 95 new CVEs assigned.
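
The edge collisions described above can be illustrated with a small simulation. The sketch below is not CollAFL's algorithm. It assumes the edge hash described in AFL's public documentation (the previous block ID shifted right by one, XORed with the current block ID, indexing a 64 KiB bitmap), and it runs on a synthetic random graph rather than a real program. The names afl_edge_index and collision_ratio are hypothetical helpers introduced only for this illustration.

```python
import random

# Assumption: AFL's default bitmap has 2**16 one-byte slots.
MAP_SIZE = 1 << 16


def afl_edge_index(prev_id: int, cur_id: int, map_size: int = MAP_SIZE) -> int:
    """AFL-style edge hash: (prev >> 1) XOR cur, folded into the bitmap."""
    return ((prev_id >> 1) ^ cur_id) % map_size


def collision_ratio(num_blocks: int, num_edges: int,
                    map_size: int = MAP_SIZE, seed: int = 0) -> float:
    """Fraction of synthetic edges that share a bitmap slot with another edge."""
    rng = random.Random(seed)
    # Each basic block receives a random ID, as AFL assigns at compile time.
    block_ids = [rng.randrange(map_size) for _ in range(num_blocks)]

    # Build a synthetic set of distinct (source, destination) edges.
    edges = set()
    while len(edges) < num_edges:
        edges.add((rng.randrange(num_blocks), rng.randrange(num_blocks)))

    # Count how many edges land in each bitmap slot.
    slot_counts = {}
    for src, dst in edges:
        idx = afl_edge_index(block_ids[src], block_ids[dst], map_size)
        slot_counts[idx] = slot_counts.get(idx, 0) + 1

    # An edge "collides" if at least one other edge maps to the same slot.
    collided = sum(count for count in slot_counts.values() if count > 1)
    return collided / num_edges


if __name__ == "__main__":
    for n_edges in (1_000, 10_000, 50_000):
        ratio = collision_ratio(num_blocks=20_000, num_edges=n_edges)
        print(f"{n_edges:>6} edges: {ratio:.1%} collide")
```

Because the bitmap size is fixed, the simulated collision ratio grows with the number of edges, which is consistent with the abstract's observation that collisions are common in larger applications. The exact percentages depend on the synthetic graph and should not be read as measurements of any real target.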
