Evaluating the Impact of Sampling-Based Nonlinear Manifold Detection Model on Software Defect Prediction Problem

Accurate defect prediction is considered essential, and it depends mainly on how thoroughly the candidate prediction models have been tested. Earlier models were mostly restricted to feature selection methods, which had limited effect on this problem in the initial stage of software development. To overcome this limitation, a software defect prediction model is proposed that combines modern nonlinear manifold detection (nonlinear MD) with SMOTE (synthetic minority oversampling technique) and uses four machine learning classification approaches. The challenging task of defect prediction is framed as three problems: high-dimensional datasets, class imbalance, and identification of the most relevant and effective software attributes. The performance of the prediction model with and without the SMOTE-nonlinear MD approach was then statistically evaluated and compared using RMSE, accuracy, and area under the curve. The results validate that the proposed SMOTE-nonlinear MD prediction model predicts defects more accurately than the alternatives.
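To make the class-imbalance step concrete, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' implementation: the function name `smote`, the neighbour count `k`, and the toy module metrics are assumptions chosen for illustration, and the nonlinear MD and classification stages are left out.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Each synthetic point lies on the segment joining a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical module metrics: [lines of code, cyclomatic complexity]
clean = [[120, 4], [80, 2], [200, 6], [150, 5], [95, 3], [60, 1]]
defective = [[900, 30], [1100, 42], [1000, 35]]

# Generate enough synthetic defective modules to match the clean class.
new_defective = smote(defective, n_new=len(clean) - len(defective))
balanced_defective = defective + new_defective
print(len(clean), len(balanced_defective))
```

Because every synthetic sample is a convex combination of two existing defective modules, the oversampled class stays inside the region already occupied by the minority data. In a fuller pipeline, oversampling of this kind would typically be applied only to the training split, so that evaluation with RMSE, accuracy, and area under the curve is done on untouched data.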
