Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization

Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. Guided by coverage information such as statement coverage and transition coverage, coverage-based fuzzing mutates inputs to cover more code and thus find more vulnerabilities, without requiring prerequisite information such as the input format. Current coverage-based fuzzing tools treat all covered code equally: every input that contributes new statements or transitions is kept for future mutation, no matter what those statements or transitions are or how much they impact security. Although this design is reasonable from the perspective of software testing, which aims at full code coverage, it is inefficient for vulnerability discovery because 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and 2) we always want to discover vulnerabilities early so that they can be fixed promptly. Even worse, because they treat code coverage non-discriminatively, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective at finding vulnerabilities in programs protected by anti-fuzzing schemes. To address the limitation caused by treating all coverage equally, we propose coverage accounting, a novel approach that evaluates coverage by security impact. Coverage accounting attributes edges by three metrics at three different levels: function, loop, and basic block. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for finding memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 6 state-of-the-art greybox and hybrid fuzzers: AFL, AFLFast, FairFuzz, MOPT, QSYM, and Angora. Statistically, TortoiseFuzz found more vulnerabilities than 5 out of 6 fuzzers (AFL, AFLFast, FairFuzz, MOPT, and Angora), and it had a comparable result to QSYM yet consumed only around 2% of QSYM’s memory on average. We also compared the coverage accounting metrics with two other metrics, AFL-Sensitive and LEOPARD, and TortoiseFuzz performed significantly better than both in finding vulnerabilities. Furthermore, we applied the coverage accounting metrics to QSYM and found that coverage accounting increased the number of discovered vulnerabilities by 28.6% on average. TortoiseFuzz found 20 zero-day vulnerabilities, 15 of which were confirmed with CVE identifiers.
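
As a rough illustration of weighting coverage by security impact rather than treating every edge equally, the following minimal Python sketch ranks inputs by the summed weight of the edges they cover. It is not TortoiseFuzz's actual algorithm: the `Seed` type, the `prioritize` function, the edge names, and the weight values are all hypothetical, and the abstract does not specify how the function-, loop-, and basic-block-level metrics are computed or combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seed:
    """A fuzzing input together with the control-flow edges it covers."""
    name: str
    edges: frozenset


def prioritize(seeds, edge_weight):
    """Return seeds ordered by the total weight of their covered edges.

    edge_weight maps an edge to an assumed security-impact score;
    edges without an entry count as 0. Ties keep their input order.
    """
    def score(seed):
        return sum(edge_weight.get(edge, 0) for edge in seed.edges)

    return sorted(seeds, key=score, reverse=True)


# Hypothetical weights: a higher value means the edge is assumed to be
# more security-relevant (for example, it leads to memory operations).
edge_weight = {("A", "B"): 0, ("B", "C"): 3, ("C", "D"): 1}

seeds = [
    Seed("s1", frozenset({("A", "B")})),
    Seed("s2", frozenset({("A", "B"), ("B", "C")})),
    Seed("s3", frozenset({("C", "D")})),
]

ranked = prioritize(seeds, edge_weight)
print([seed.name for seed in ranked])
```

Under an equal-coverage policy all three inputs would be kept without distinction; with the assumed weights, the input that reaches the highest-weight edge is mutated first.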