Machine learning based software fault prediction utilizing source code metrics

Conventional fault detection techniques require prior knowledge of faults or a special structure, which may not be realistic in practice. To address this problem, the proposed approach predicts software faults from source code metrics. The paper measures fault prediction capability in terms of accuracy, F-measure, precision, recall, and Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). The study also investigates the effect of feature selection techniques on software fault prediction. The approach is validated experimentally on four publicly available datasets. Random Forest outperforms the other machine learning techniques in most cases. Feature selection improves performance in a few cases; in most cases, however, its effect is negligible or even negative.
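The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 marks a faulty module and 0 a non-faulty one, and it computes AUC with the standard rank-based (Mann-Whitney) formulation. The function names and the toy data are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-measure for binary labels (1 = faulty)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # faulty, predicted faulty
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, predicted faulty
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, predicted clean
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # faulty, predicted clean

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a faulty module is scored above a clean one.

    Ties between a faulty and a clean score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one faulty and one non-faulty module")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration, not taken from the paper's datasets).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.8, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

The pairwise AUC loop is quadratic in the number of modules, which is acceptable for a small illustration; a sort-based version would be the usual choice for larger datasets.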
