Within-project and cross-project just-in-time defect prediction based on denoising autoencoder and convolutional neural network

Just-in-time defect prediction is an important and useful branch in software defect prediction. At present, deep learning is a research hotspot in the field of artificial intelligence, which can combine basic defect features into deep semantic features and make up for the shortcomings of machine learning algorithms. However, the mainstream deep learning techniques have not been applied yet in just-in-time defect prediction. Therefore, the authors propose a novel just-in-time defect prediction model named DAECNN-JDP based on denoising autoencoder and convolutional neural network in this study, which has three main advantages: (i) Different weights for the position vector of each dimension feature are set, which can be automatically trained by adaptive trainable vector. (ii) Through the training of denoising autoencoder, the input features that are not contaminated by noise can be obtained, thus learning more robust feature representation. (iii) The authors leverage a powerful representation-learning technique, convolution neural network, to construct the basic change features into the abstract deep semantic features. To evaluate the performance of the DAECNN-JDP model, they conduct extensive within-project and cross-project defect prediction experiments on six large open source projects. The experimental results demonstrate that the superiority of DAECNN-JDP on five evaluation metrics.

[1]  Liqing Zhang,et al.  Feature learning from incomplete EEG with denoising autoencoder , 2014, Neurocomputing.

[2]  Meiqi Wang,et al.  Trust-Aware Collaborative Filtering with a Denoising Autoencoder , 2018, Neural Processing Letters.

[3]  Audris Mockus,et al.  Predicting risk of software changes , 2000, Bell Labs Technical Journal.

[4]  Audris Mockus,et al.  A large-scale empirical study of just-in-time quality assurance , 2013, IEEE Transactions on Software Engineering.

[5]  David Lo,et al.  Revisiting supervised and unsupervised models for effort-aware just-in-time defect prediction , 2018, Empirical Software Engineering.

[6]  Xinli Yang,et al.  TLEL: A two-layer ensemble learning approach for just-in-time defect prediction , 2017, Inf. Softw. Technol..

[7]  Harvey P. Siy,et al.  Predicting Fault Incidence Using Software Change History , 2000, IEEE Trans. Software Eng..

[8]  Ken-ichi Matsumoto,et al.  Studying re-opened bugs in open source software , 2012, Empirical Software Engineering.

[9]  David Lo,et al.  Identifying self-admitted technical debt in open source projects using text mining , 2017, Empirical Software Engineering.

[10]  Naoyasu Ubayashi,et al.  Studying just-in-time defect prediction using cross-project models , 2015, Empirical Software Engineering.

[11]  Hongfang Liu,et al.  An Investigation into the Functional Form of the Size-Defect Relationship for Software Modules , 2009, IEEE Transactions on Software Engineering.

[12]  Pascal Vincent,et al.  A Connection Between Score Matching and Denoising Autoencoders , 2011, Neural Computation.

[13]  Laurie A. Williams,et al.  Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities , 2011, IEEE Transactions on Software Engineering.

[14]  Dewayne E. Perry,et al.  Toward understanding the rhetoric of small source code changes , 2005, IEEE Transactions on Software Engineering.

[15]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[16]  Yannis Papanikolaou,et al.  Denoising Autoencoder Self-Organizing Map (DASOM) , 2018, Neural Networks.