Mining Co-location Relationships among Bug Reports to Localize Fault-Prone Modules

Automated bug localization is an important problem in software engineering. Over the last few decades, various proactive and reactive localization approaches have been proposed to predict fault-prone software modules. However, most of these approaches require source code information or software complexity metrics to perform localization. In this paper, we propose a reactive approach that uses only bug report information and historical revision logs. Our approach explores the co-location relationships among bug reports to improve the prediction accuracy of a state-of-the-art learning method. Studies on three open source projects show that the proposed scheme consistently improves prediction accuracy across all three projects, by nearly 11.6% on average.
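The abstract does not spell out how co-location is computed, so the following is only a minimal sketch under a stated assumption: two bug reports are treated as co-located when the revision log shows their fixes touched the same module. The function name `colocated_pairs`, the `(bug_id, module)` log format, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def colocated_pairs(revision_log):
    """Return the set of bug-report pairs whose fixes touched a common module.

    revision_log: iterable of (bug_id, module) tuples, each meaning that the
    fix for bug_id modified module. This input format is an assumption made
    for illustration only.
    """
    # Group bug reports by the module their fix touched.
    bugs_by_module = defaultdict(set)
    for bug_id, module in revision_log:
        bugs_by_module[module].add(bug_id)

    # Every pair of reports sharing a module counts as co-located.
    pairs = set()
    for bugs in bugs_by_module.values():
        for a, b in combinations(sorted(bugs), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical revision-log entries.
sample_log = [
    (101, "ui/menu.c"),
    (102, "ui/menu.c"),
    (103, "net/http.c"),
    (102, "net/http.c"),
]
sample_pairs = colocated_pairs(sample_log)
```

In a pipeline of the kind the abstract describes, such pairs could serve as extra evidence alongside the text of the reports when training a classifier. How the paper actually combines them with the learning method is not stated here.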
