Method-level bug localization using hybrid multi-objective search

Abstract

Context: Localizing bugs is one of the most time-consuming maintenance tasks, especially in large software systems. Developers have to follow a tedious process to reproduce the abnormal behavior and then inspect a large number of files. Although several bug localization approaches have been proposed, most of them recommend classes or files as outputs, which may still require high inspection effort. Furthermore, there is a significant gap between the natural language used in bug reports and the programming language of the source code. This gap limits the effectiveness of existing approaches, since most of them rely mainly on lexical similarity.

Objective: In this paper, we propose an automated approach that finds and ranks the methods most likely to be the source of a bug, based on the bug report description.

Method: Our approach balances minimizing the number of recommended classes against maximizing the relevance of the proposed solution, using a hybrid multi-objective optimization algorithm that combines local and global search. The relevance of the recommended code fragments is estimated from the history of changes and bug fixes, and from the lexical similarity between the bug report description and the API documentation. Our approach operates in two main steps. The first step finds the best set of classes satisfying the two conflicting criteria of relevance and number of classes to recommend, using a global search based on NSGA-II. The second step locates the most appropriate methods to inspect within the classes recommended by the first step, using a local multi-objective search based on Simulated Annealing (MOSA).

Results: We evaluated our approach on six open-source Java projects, using for each bug report studied the version of the project that precedes the fix. Our hybrid multi-objective approach successfully locates the true buggy methods within the top 10 recommendations for over 78% of the bug reports, leading to a significant reduction in developers' effort compared to class-level bug localization techniques.

Conclusion: The experimental results show that the search-based approach significantly outperforms four state-of-the-art techniques in recommending relevant files for bug reports.
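The trade-off behind the first step can be illustrated with a minimal sketch. It is not the authors' implementation: the class names and relevance scores are hypothetical, relevance is assumed to be a simple sum of per-class scores, and the Pareto front is found by exhaustive enumeration, which is only feasible for a toy candidate set (the approach described above uses NSGA-II to approximate this front on real projects).

```python
from itertools import combinations

# Hypothetical per-class relevance scores (illustrative values only; in the
# approach above, relevance comes from change history and lexical similarity).
RELEVANCE = {"Parser": 0.9, "Lexer": 0.6, "Cache": 0.2, "Logger": 0.1}


def objectives(subset):
    """Return (number of classes, total relevance) for a candidate subset."""
    return (len(subset), sum(RELEVANCE[c] for c in subset))


def dominates(a, b):
    """True if a is at least as good as b on both objectives and strictly
    better on one: fewer classes is better, higher relevance is better."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    scored = [(s, objectives(s)) for s in candidates]
    return [s for s, o in scored
            if not any(dominates(other, o) for _, other in scored)]


# Enumerate every non-empty subset of the toy class set.
candidates = [frozenset(c)
              for r in range(1, len(RELEVANCE) + 1)
              for c in combinations(RELEVANCE, r)]
front = pareto_front(candidates)

for subset in sorted(front, key=len):
    size, relevance = objectives(subset)
    print(size, round(relevance, 2), sorted(subset))
```

Each subset on the front is a different compromise between inspection effort and relevance. A second, local search would then rank methods inside the classes of the chosen subset.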
