Semantic Crash Bucketing

Precise crash triage is important for automated dynamic testing tools such as fuzzers. At scale, fuzzers produce millions of crashing inputs. Fuzzers use heuristics, like stack hashes, to cut down on duplicate bug reports. These heuristics are fast but often imprecise: even after deduplication, hundreds of uniquely reported crashes can still correspond to the same bug. The remaining crashes must be inspected manually, which takes considerable effort. In this paper, we present Semantic Crash Bucketing, a generic method for precise crash bucketing using program transformation. Semantic Crash Bucketing maps crashing inputs to unique bugs as a function of changing a program (i.e., a semantic delta). We observe that a real bug fix precisely identifies crashes belonging to the same bug: the inputs that stop crashing once the fix is applied. Our insight is to approximate real bug fixes with lightweight program transformation to obtain the same level of precision. Our approach uses (a) patch templates and (b) semantic feedback from the program to automatically generate and apply approximate fixes for general bug classes. Our evaluation shows that approximate fixes are competitive with true fixes for crash bucketing, and that they significantly outperform the built-in deduplication techniques of three state-of-the-art fuzzers.
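The bucketing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy program, the two hand-written "approximate fixes", and the names `make_program` and `bucket_crashes` are all hypothetical, and a Python exception stands in for a crash. Each crashing input is assigned to the first patched variant under which it no longer crashes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# A "program" consumes an input and raises an exception when it "crashes".
Program = Callable[[bytes], None]


def make_program(fix_overflow: bool = False, fix_div: bool = False) -> Program:
    """Build a toy program with two seeded bugs, each optionally patched."""
    def program(data: bytes) -> None:
        buf = [0] * 4
        n = len(data)
        if fix_overflow:
            n = min(n, len(buf))      # hypothetical fix: clamp the copy length
        for i in range(n):
            buf[i] = data[i]          # bug 1: IndexError when n > 4
        divisor = data[0] if data else 1
        if fix_div and divisor == 0:
            return                    # hypothetical fix: guard the division
        _ = 100 // divisor            # bug 2: ZeroDivisionError when divisor == 0
    return program


def bucket_crashes(
    crashing_inputs: Iterable[bytes],
    patched_programs: Dict[str, Program],
) -> Dict[str, List[bytes]]:
    """Assign each input to the first patch under which it stops crashing."""
    buckets: Dict[str, List[bytes]] = defaultdict(list)
    for data in crashing_inputs:
        label = "unbucketed"
        for name, program in patched_programs.items():
            try:
                program(data)
            except Exception:
                continue              # still crashes: this patch is not the fix
            label = name
            break
        buckets[label].append(data)
    return dict(buckets)


PATCHED = {
    "bounds-check": make_program(fix_overflow=True),
    "zero-divisor-guard": make_program(fix_div=True),
}
CRASHES = [b"AAAAAA", b"BBBBBBBB", b"\x00", b"\x00" * 6]

if __name__ == "__main__":
    for bucket, inputs in bucket_crashes(CRASHES, PATCHED).items():
        print(bucket, inputs)
```

In this sketch the last input triggers both seeded bugs, so no single patch silences it and it lands in the "unbucketed" group; how to treat such inputs is a design choice that this toy version leaves open.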
