An effective change recommendation approach for supplementary bug fixes

Bug fixing is one of the most important activities during software development and maintenance. A substantial number of bugs are often fixed more than once due to incomplete initial fixes which need to be followed up by supplementary fixes. Automatically recommending relevant change locations for supplementary bug fixes can help developers to improve their productivity. It also help improve the reliability of systems by highlighting locations that a developer potentially needs to change to completely remove a bug. Unfortunately, a recent study by Park et al. shows that many change recommendation techniques do not work for supplementary bug fixes. In this paper, to advance the capabilities of existing change recommendation techniques, we propose an effective approach named SupLocator to recommend relevant locations (i.e., methods) that need to be changed for supplementary bug fixes. Based on various relationships among methods, classes, and packages in the source code (such as containment, inheritance, historical co-change, etc.), SupLocator extracts six change relationship graphs. Next, SupLocator performs random walk on each of the 6 graphs, and for each it outputs a ranked list of candidate change locations. Finally, SupLocator combines these six ranked lists by leveraging genetic algorithm. To investigate the benefits of SupLocator, we perform experiments on three projects, i.e., Eclipse JDT, Eclipse SWT, and Equinox p2. The experimental results show that on average SupLocator can achieve top-1, top-5, and top-10 accuracies, mean reciprocal rank (MRR), and mean average precision (MAP) of 0.51, 0.65, 0.67, 0.58 and 0.32 for the three projects, which improve the best variants of the approach proposed by Park et al. by 1523.09, 639.70, 550.62, 919.41, and 1478.44 %, respectively. It also improves the approach proposed by Saul et al. in terms of top-1, top-5, and top-10 accuracies, MRR, and MAP by 71.81, 29.54, 18.30, 47.24, and 56.60 %, respectively. Statistical tests show that the improvements are statistically significant.

[1]  Yoram Singer,et al.  An Efficient Boosting Algorithm for Combining Preferences by , 2013 .

[2]  Yuanyuan Zhang,et al.  Search-based software engineering: Trends, techniques and applications , 2012, CSUR.

[3]  Premkumar T. Devanbu,et al.  Recommending random walks , 2007, ESEC-FSE '07.

[4]  D. West Introduction to Graph Theory , 1995 .

[5]  Martin P. Robillard,et al.  Non-essential changes in version histories , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[6]  Tibor Gyimóthy,et al.  Using information retrieval based coupling measures for impact analysis , 2009, Empirical Software Engineering.

[7]  Andrea De Lucia,et al.  Cross-project defect prediction models: L'Union fait la force , 2014, 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE).

[8]  Kalyanmoy Deb,et al.  Multi-objective optimization using evolutionary algorithms , 2001, Wiley-Interscience series in systems and optimization.

[9]  Jane Cleland-Huang,et al.  Improving trace accuracy through data-driven configuration and composition of tracing features , 2013, ESEC/FSE 2013.

[10]  Ken-ichi Matsumoto,et al.  Studying re-opened bugs in open source software , 2012, Empirical Software Engineering.

[11]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[12]  Foutse Khomh,et al.  Supplementary Bug Fixes vs. Re-opened Bugs , 2014, 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation.

[13]  Jian Zhou,et al.  Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[14]  David Lo,et al.  Automatic, high accuracy prediction of reopened bugs , 2014, Automated Software Engineering.

[15]  Ahmed Tamrawi,et al.  Fuzzy set and cache-based approach for bug triaging , 2011, ESEC/FSE '11.

[16]  Sarfraz Khurshid,et al.  Improving bug localization using structured information retrieval , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[17]  Laurie J. Hendren,et al.  Enabling static analysis for partial java programs , 2008, OOPSLA.

[18]  Claire Le Goues,et al.  GenProg: A Generic Method for Automatic Software Repair , 2012, IEEE Transactions on Software Engineering.

[19]  Richard C. Holt,et al.  Predicting change propagation in software systems , 2004, 20th IEEE International Conference on Software Maintenance, 2004. Proceedings..

[20]  Thorsten Joachims,et al.  Optimizing search engines using clickthrough data , 2002, KDD.

[21]  Yang Feng,et al.  Towards more accurate multi-label software behavior learning , 2014, 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE).

[22]  Martin P. Robillard,et al.  Concern graphs: finding and describing concerns using structural program dependencies , 2002, Proceedings of the 24th International Conference on Software Engineering. ICSE 2002.

[23]  Qingfu Zhang,et al.  MOEA/D: A Multiobjective Evolutionary Algorithm Based on Decomposition , 2007, IEEE Transactions on Evolutionary Computation.

[24]  Philip J. Guo,et al.  Characterizing and predicting which bugs get reopened , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[25]  Lionel C. Briand,et al.  A practical guide for using statistical tests to assess randomized algorithms in software engineering , 2011, 2011 33rd International Conference on Software Engineering (ICSE).

[26]  Andreas Zeller,et al.  The impact of tangled code changes , 2013, 2013 10th Working Conference on Mining Software Repositories (MSR).

[27]  Hajimu Iida,et al.  Who should review my code? A file location-based code-reviewer recommendation approach for Modern Code Review , 2015, 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER).

[28]  Eleni Stroulia,et al.  UMLDiff: an algorithm for object-oriented design differencing , 2005, ASE.

[29]  Dongmei Zhang,et al.  How do software engineers understand code changes?: an exploratory study in industry , 2012, SIGSOFT FSE.

[30]  Tie-Yan Liu,et al.  Learning to Rank for Information Retrieval , 2011 .

[31]  Gregory N. Hullender,et al.  Learning to rank using gradient descent , 2005, ICML.

[32]  Miryung Kim,et al.  An empirical study on reducing omission errors in practice , 2014, ASE.

[33]  David Lo,et al.  Cross-language bug localization , 2014, ICPC 2014.

[34]  Brian D. Davison,et al.  Learning to rank for freshness and relevance , 2011, SIGIR.

[35]  Shinji Kusumoto,et al.  CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code , 2002, IEEE Trans. Software Eng..

[36]  Hoan Anh Nguyen,et al.  Recurring bug fixes in object-oriented programs , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[37]  Will G. Hopkins,et al.  A new view of statistics , 2002 .

[38]  Andreas Zeller,et al.  Mining Cause-Effect-Chains from Version Histories , 2011, 2011 IEEE 22nd International Symposium on Software Reliability Engineering.

[39]  H. Abdi The Bonferonni and Šidák Corrections for Multiple Comparisons , 2006 .

[40]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[41]  S. N. Sivanandam,et al.  Introduction to genetic algorithms , 2007 .

[42]  Taghi M. Khoshgoftaar,et al.  Evolutionary Optimization of Software Quality Modeling with Multiple Repositories , 2010, IEEE Transactions on Software Engineering.

[43]  Mark Harman,et al.  Searching for better configurations: a rigorous approach to clone evaluation , 2013, ESEC/FSE 2013.

[44]  Miryung Kim,et al.  An empirical study of supplementary bug fixes , 2012, 2012 9th IEEE Working Conference on Mining Software Repositories (MSR).

[45]  Tao Qin,et al.  FRank: a ranking method with fidelity loss , 2007, SIGIR.

[46]  Avinash C. Kak,et al.  Retrieval from software libraries for bug localization: a comparative study of generic and composite text models , 2011, MSR '11.

[47]  Mark Harman,et al.  Search-based software engineering , 2001, Inf. Softw. Technol..

[48]  David E. Goldberg,et al.  Genetic algorithms and Machine Learning , 1988, Machine Learning.

[49]  Richard C. Holt,et al.  Replaying development history to assess the effectiveness of change propagation tools , 2006, Empirical Software Engineering.

[50]  Gail C. Murphy,et al.  Predicting source code changes by mining change history , 2004, IEEE Transactions on Software Engineering.

[51]  Ahmed E. Hassan,et al.  Supporting software evolution using adaptive change propagation heuristics , 2008, 2008 IEEE International Conference on Software Maintenance.

[52]  Gerardo Canfora,et al.  Multi-objective Cross-Project Defect Prediction , 2013, 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation.

[53]  Zhi-Hua Zhou,et al.  Ensemble Methods: Foundations and Algorithms , 2012 .

[54]  Andreas Zeller,et al.  Mining version histories to guide software changes , 2005, Proceedings. 26th International Conference on Software Engineering.

[55]  Andrea De Lucia,et al.  How to effectively use topic models for software engineering tasks? An approach based on Genetic Algorithms , 2013, 2013 35th International Conference on Software Engineering (ICSE).

[56]  Martin P. Robillard,et al.  Automatic generation of suggestions for program investigation , 2005, ESEC/FSE-13.