SEET: Software Development Effort Estimation Using Ensemble Techniques

Software development effort estimation (SDEE) is a significant activity in project management and serves as the basis for project bidding, planning, staffing, resource allocation, scheduling, and cost estimation. The accuracy of SDEE techniques varies from project to project, which makes them unreliable. Against this backdrop, we propose a foundation-centered, ensemble-based SDEE approach. The primary goal of this approach is to design an ensemble of different machine learning methods that improves the prediction accuracy of SDEE. Several research results on machine-learning-based ensemble design have been reported recently, but extreme learning machine (ELM) and least squares support vector regression (LSSVR) have not been used to develop an ensemble. We chose three machine learning techniques, namely ELM, LSSVR, and multilayer perceptron (MLP), as the base techniques to build an ensemble. We investigated the performance of a homogeneous ensemble design that uses a linear combination rule with standardized accuracy as the weight factor. The performance of the ensemble model is validated and compared with that of a root mean square error (RMSE) based weighted average ensemble model with an equivalent configuration. The experimental study was conducted using the publicly available PROMISE repository test suite. The SEET model achieved promising results compared with the base learners and the RMSE-based ensemble model.
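
As an illustration of the linear combination rule described above, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of standardized accuracy, SA = 1 - MAE / MAE_P0, where MAE_P0 is the expected error of random guessing, and it assumes that the weights are the base learners' SA values normalized to sum to one. The function names, the clipping of negative SA values to zero, and the inverse-RMSE weighting used for the comparison ensemble are all hypothetical choices made for this example.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error between actual and predicted effort."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p)))


def random_guess_mae(actual):
    """Expected MAE when each project is 'predicted' by the effort of
    another project picked uniformly at random (assumed baseline)."""
    a = np.asarray(actual, dtype=float)
    n = len(a)
    pairwise = np.abs(a[:, None] - a[None, :])  # diagonal entries are zero
    return float(pairwise.sum() / (n * (n - 1)))


def standardized_accuracy(actual, predicted):
    """SA = 1 - MAE / MAE_P0 (assumed definition)."""
    return 1.0 - mae(actual, predicted) / random_guess_mae(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))


def sa_weights(actual, base_predictions):
    """Weights proportional to each base learner's SA (assumption).
    Learners no better than random guessing get zero weight."""
    sa = np.array([standardized_accuracy(actual, p) for p in base_predictions])
    sa = np.clip(sa, 0.0, None)
    if sa.sum() == 0.0:
        return np.full(len(sa), 1.0 / len(sa))  # fall back to a plain average
    return sa / sa.sum()


def rmse_weights(actual, base_predictions, eps=1e-12):
    """Weights proportional to inverse RMSE (assumed comparison scheme)."""
    inv = np.array([1.0 / (rmse(actual, p) + eps) for p in base_predictions])
    return inv / inv.sum()


def combine(base_predictions, weights):
    """Linear combination rule: weighted average of base predictions."""
    preds = np.asarray(base_predictions, dtype=float)
    return np.average(preds, axis=0, weights=weights)


# Hypothetical validation data: actual effort and the outputs of three
# base learners (standing in for ELM, LSSVR, and MLP).
actual_effort = [10.0, 20.0, 30.0, 40.0]
base_outputs = [
    [10.0, 20.0, 30.0, 40.0],  # learner A: exact
    [15.0, 25.0, 35.0, 45.0],  # learner B: off by 5 everywhere
    [12.0, 18.0, 33.0, 37.0],  # learner C: small mixed errors
]
weights = sa_weights(actual_effort, base_outputs)
ensemble_estimate = combine(base_outputs, weights)
```

In practice the weights would be estimated on training or validation projects and then applied to the base learners' predictions for unseen projects; the toy data above reuses one set only to keep the sketch short.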
