Empirical Assessment of Machine Learning Techniques for Software Requirements Risk Prediction

Software risk prediction is the most sensitive and crucial activity of the Software Development Life Cycle (SDLC), and it can determine whether a project succeeds or fails. Risks should therefore be predicted as early as possible. This study proposes a model for predicting software requirement risks using a requirement risk dataset and machine learning techniques. Ten classifiers are compared to identify the technique best suited to the nature of the dataset: K-Nearest Neighbour (KNN), Averaged One-Dependence Estimators (A1DE), Naive Bayes (NB), Composite Hypercubes on Iterated Random Projections (CHIRP), Decision Table (DT), Decision Table/Naive Bayes Hybrid Classifier (DTNB), Credal Decision Trees (CDT), Cost-Sensitive Decision Forest (CS-Forest), J48 Decision Tree (J48), and Random Forest (RF). The techniques are evaluated using Correctly Classified Instances (CCI), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE), precision, recall, F-measure, Matthews Correlation Coefficient (MCC), Receiver Operating Characteristic area (ROC area), Precision-Recall Curve area (PRC area), and accuracy. Overall, CDT yields the lowest error rates, achieving 0.013 for MAE, 0.089 for RMSE, 4.498% for RAE, and 23.741% for RRSE. In terms of accuracy, DT, DTNB, and CDT achieve the best results.
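The four error measures named above can be illustrated with a minimal sketch. It uses the common textbook definitions, in which RAE and RRSE normalize the error against a baseline that always predicts the mean of the actual values. The function name `error_metrics` and the toy values are assumptions made for illustration only; they are not the study's dataset, tooling, or results.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE (%) and RRSE (%) for two equal-length sequences.

    Hypothetical helper for illustration; uses the textbook definitions with a
    "predict the mean of the actual values" baseline for RAE and RRSE.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_actual = sum(actual) / n

    # Total absolute and squared error of the predictions.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Same totals for the baseline that always predicts the mean.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0 or base_sq == 0:
        raise ValueError("actual values are constant; RAE and RRSE are undefined")

    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE_percent": 100.0 * abs_err / base_abs,
        "RRSE_percent": 100.0 * math.sqrt(sq_err / base_sq),
    }


# Toy example (made-up values): 1 = risky requirement, 0 = not risky,
# with predictions given as scores in [0, 1].
actual = [1, 0, 1, 0]
predicted = [0.9, 0.1, 0.8, 0.2]
metrics = error_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Lower values are better for all four measures. RAE and RRSE below 100% indicate that the model improves on the mean-value baseline, which is how relative figures such as those reported for CDT are usually read.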
