Effective Prediction of Bug-Fixing Priority via Weighted Graph Convolutional Networks

As the number of software bugs grows, bug fixing plays an increasingly important role in software development and maintenance. To resolve bugs efficiently, developers rely on bug reports; in particular, bug triagers usually depend on bug descriptions to assign priority levels to reported bugs. However, manual priority assignment is time-consuming and cumbersome. To address this problem, recent studies have proposed many approaches that automatically predict the priority levels of reported bugs. Unfortunately, these approaches still face two challenges: the nonconsecutive semantics of words in bug reports, and imbalanced data. In this article, we propose a novel approach that uses graph convolutional networks (GCN) trained with a weighted loss function to predict the priority of bug reports. For the first challenge, we build a heterogeneous text graph for bug reports and apply GCN to extract the semantics of words in bug reports. For the second challenge, we construct a weighted loss function in the training phase. We evaluate priority prediction on four open-source projects: Mozilla, Eclipse, NetBeans, and the GNU Compiler Collection (GCC). Experimental results show that our method outperforms two baseline approaches in F-measure by a weighted average of 13.22%.
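To make the second idea concrete, the following is a minimal sketch of a class-weighted cross-entropy loss for imbalanced priority labels. The abstract does not state which weighting scheme the approach uses, so the inverse-frequency weights here are an assumption chosen for illustration, and the names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical helpers, not part of the described method. The logits stand in for the per-report outputs of a GCN.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    ASSUMPTION: inverse-frequency weighting is used only for illustration;
    the abstract does not specify the actual weighting scheme.
    Weights are scaled so that they average to 1 over the training samples.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_batch, labels, class_weights):
    """Weighted mean of per-sample negative log-likelihoods.

    Each sample's loss is multiplied by the weight of its true class, so
    errors on rare priority levels contribute more to the total.
    """
    total_loss = 0.0
    total_weight = 0.0
    for logits, y in zip(logits_batch, labels):
        probs = softmax(logits)
        w = class_weights[y]
        total_loss += -w * math.log(probs[y])
        total_weight += w
    return total_loss / total_weight


# Toy example: class 2 is common (8 reports), class 0 is rare (2 reports).
train_labels = [2] * 8 + [0] * 2
weights = inverse_frequency_weights(train_labels)

# Hypothetical model outputs (three classes) for two reports.
batch_logits = [[2.0, 0.1, 0.3], [0.2, 0.1, 1.5]]
batch_labels = [0, 2]
loss = weighted_cross_entropy(batch_logits, batch_labels, weights)
```

With weights like these, a misclassified report from a rare priority level is penalized more heavily than one from a frequent level, which is the general effect a weighted loss is meant to have on imbalanced data.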
