Paper information - Does the fault reside in a stack trace? Assisting crash localization by predicting crashing fault residence

Abstract: Given a stack trace reported at the time of a software crash, crash localization aims to pinpoint the root cause of the crash. Crash localization is known to be a time-consuming and labor-intensive task. Without tool support, developers must spend tedious manual effort examining a large amount of source code, relying on their own experience. In this paper, we propose an automatic approach, namely CraTer, which predicts whether a crashing fault resides in stack traces or not (referred to as predicting crashing fault residence). We extract 89 features from stack traces and source code to train a predictive model based on known crashes. We then use the model to predict the residence of newly submitted crashes. CraTer can reduce the search space for crashing faults and help prioritize crash localization efforts. Experimental results on crashes of seven real-world projects demonstrate that CraTer can achieve an average accuracy of over 92%.
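To make the notion of "crashing fault residence" concrete, the following is a minimal sketch of how a training label could be derived for a known crash. It is not CraTer's implementation: the abstract does not state the matching granularity, so the sketch assumes a fault "resides in" the trace when its class and line number match some frame. The names `Frame`, `parse_frames`, `fault_in_trace`, and the sample trace are hypothetical and used only for illustration.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    cls: str     # fully qualified class name
    method: str  # method name, e.g. "parse" or "<init>"
    file: str    # source file name
    line: int    # line number reported in the frame


# Matches Java frames such as "at org.example.Foo.bar(Foo.java:42)".
# Frames without a file and line (e.g. "(Native Method)") are skipped.
FRAME_RE = re.compile(
    r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(([\w$]+\.java):(\d+)\)\s*$"
)


def parse_frames(trace: str) -> List[Frame]:
    """Extract the frames of a Java stack trace, top frame first."""
    frames = []
    for raw in trace.splitlines():
        m = FRAME_RE.match(raw)
        if m:
            frames.append(
                Frame(m.group(1), m.group(2), m.group(3), int(m.group(4)))
            )
    return frames


def fault_in_trace(frames: List[Frame], faulty_class: str, faulty_line: int) -> bool:
    """Assumed labeling rule: the fault resides in the trace if some
    frame points at the same class and line as the known fault."""
    return any(f.cls == faulty_class and f.line == faulty_line for f in frames)


# Hypothetical crash report used only to exercise the functions above.
SAMPLE_TRACE = """java.lang.NullPointerException
    at org.example.Parser.parse(Parser.java:57)
    at org.example.Main.run(Main.java:21)
    at org.example.Main.main(Main.java:9)
"""

frames = parse_frames(SAMPLE_TRACE)
inside = fault_in_trace(frames, "org.example.Parser", 57)
outside = fault_in_trace(frames, "org.example.Config", 12)
```

Labels of this kind, paired with features extracted from the stack trace and the source code, would form the training set for the predictive model the abstract describes. A crash predicted as residing outside the trace tells the developer that inspecting the frames alone is unlikely to reveal the fault.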
