Scaffle: bug localization on millions of files

Despite all efforts to avoid bugs, software sometimes crashes in the field, leaving crash traces as the only information to localize the problem. Prior approaches on localizing where to fix the root cause of a crash do not scale well to ultra-large scale, heterogeneous code bases that contain millions of code files written in multiple programming languages. This paper presents Scaffle, the first scalable bug localization technique, which is based on the key insight to divide the problem into two easier sub-problems. First, a trained machine learning model predicts which lines of a raw crash trace are most informative for localizing the bug. Then, these lines are fed to an information retrieval-based search engine to retrieve file paths in the code base, predicting which file to change to address the crash. The approach does not make any assumptions about the format of a crash trace or the language that produces it. We evaluate Scaffle with tens of thousands of crash traces produced by a large-scale industrial code base at Facebook that contains millions of possible bug locations and that powers tools used by billions of people. The results show that the approach correctly predicts the file to fix for 40% to 60% (50% to 70%) of all crash traces within the top-1 (top-5) predictions. Moreover, Scaffle improves over several baseline approaches, including an existing classification-based approach, a scalable variant of existing information retrieval-based approaches, and a set of hand-tuned, industrially deployed heuristics.

[1]  Dongmei Zhang,et al.  ReBucket: A method for clustering duplicate crash reports based on call stack similarity , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[2]  Anh Tuan Nguyen,et al.  Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports (N) , 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[3]  Sooyong Park,et al.  Which Crashes Should I Fix First?: Predicting Top Crashes at an Early Stage to Prioritize Debugging Efforts , 2011, IEEE Transactions on Software Engineering.

[4]  Rui Abreu,et al.  A Survey on Software Fault Localization , 2016, IEEE Transactions on Software Engineering.

[5]  Foutse Khomh,et al.  An Entropy Evaluation Approach for Triaging Field Crashes: A Case Study of Mozilla Firefox , 2011, 2011 18th Working Conference on Reverse Engineering.

[6]  Claire Le Goues,et al.  Semantic Crash Bucketing , 2018, 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE).

[7]  Lu Zhang,et al.  Boosting Bug-Report-Oriented Fault Localization with Segmentation and Stack-Trace Analysis , 2014, 2014 IEEE International Conference on Software Maintenance and Evolution.

[8]  David Broman,et al.  Automated bug assignment: Ensemble-based machine learning in large scale industrial contexts , 2016, Empirical Software Engineering.

[9]  Alessandro Orso,et al.  Evaluating the usefulness of IR-based fault localization techniques , 2015, ISSTA.

[10]  Shaohua Wang,et al.  Improving bug localization using correlations in crash reports , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).

[11]  A.J.C. van Gemund,et al.  On the Accuracy of Spectrum-based Fault Localization , 2007, Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007).

[12]  Koushik Sen,et al.  Retrieval on source code: a neural code search , 2018, MAPL@PLDI.

[13]  Jacques Klein,et al.  D&C: A Divide-and-Conquer Approach to IR-based Bug Localization , 2019, ArXiv.

[14]  Chanchal Kumar Roy,et al.  Improving IR-based bug localization with context-aware query reformulation , 2018, ESEC/SIGSOFT FSE.

[15]  Ming Wen,et al.  ChangeLocator: locate crash-inducing changes based on crash reports , 2017, Empirical Software Engineering.

[16]  Foutse Khomh,et al.  Classifying field crash reports for fixing bugs: A case study of Mozilla Firefox , 2011, 2011 27th IEEE International Conference on Software Maintenance (ICSM).

[17]  Ming Wen,et al.  Locus: Locating bugs from software changes , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[18]  Claire Le Goues,et al.  Automated program repair , 2019, Commun. ACM.

[19]  Andrian Marcus,et al.  On the Use of Stack Traces to Improve Text Retrieval-Based Bug Localization , 2014, 2014 IEEE International Conference on Software Maintenance and Evolution.

[20]  Razvan C. Bunescu,et al.  Learning to rank relevant files for bug reports using domain knowledge , 2014, SIGSOFT FSE.

[21]  Galen C. Hunt,et al.  Debugging in the (very) large: ten years of implementation and experience , 2009, SOSP '09.

[22]  Ranjita Bhagwan,et al.  Orca: Differential Bug Localization in Large-Scale Services , 2018, OSDI.

[23]  Jian Zhou,et al.  Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[24]  Sarfraz Khurshid,et al.  Improving bug localization using structured information retrieval , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[25]  Luisa Verdoliva,et al.  Automatically analyzing groups of crashes for finding correlations , 2017, ESEC/SIGSOFT FSE.

[26]  Hung Viet Nguyen,et al.  A topic-based approach for narrowing the search space of buggy files from a bug report , 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[27]  Razvan C. Bunescu,et al.  Mapping Bug Reports to Relevant Files: A Ranking Model, a Fine-Grained Benchmark, and Feature Evaluation , 2016, IEEE Transactions on Software Engineering.

[28]  Rongxin Wu,et al.  CrashLocator: locating crashing faults based on crash stacks , 2014, ISSTA 2014.

[29]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[30]  Arie van Deursen,et al.  A guided genetic algorithm for automated crash reproduction , 2017, ICSE 2017.

[31]  John T. Stasko,et al.  Visualization of test information to assist fault localization , 2002, ICSE '02.

[32]  Andreas Zeller,et al.  Where Should We Fix This Bug? A Two-Phase Recommendation Model , 2013, IEEE Transactions on Software Engineering.

[33]  Alessandro Orso,et al.  Enlightened Debugging , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).