Comparative study of random search hyper-parameter tuning for software effort estimation

Empirical studies on software effort estimation (SEE) have employed hyper-parameter tuning algorithms to improve model accuracy and stability. While these tuners can improve model performance, some may be overly complex or costly for the low-dimensional datasets used in SEE. In such cases, a method like random search can provide benefits similar to those of existing tuners while requiring fewer resources and being simpler to implement. In this study, we evaluate the impact on model accuracy and stability of 12 state-of-the-art hyper-parameter tuning algorithms compared against random search, on 9 datasets from the PROMISE repository and 4 sub-datasets from the ISBSG R18 dataset. The study covers 2 traditional tuners (grid search and random search), 6 bio-inspired algorithms, 2 heuristic tuners, and 3 model-based algorithms. The tuners are used to configure support vector regression, classification and regression trees, and ridge regression models. We aim to determine 1) the techniques and datasets for which tuners were more effective than default hyper-parameters, 2) those for which tuners were more effective than random search, and 3) which model(s) can be considered "the best" for which datasets. The results show that hyper-parameter tuning was effective (increased accuracy and stability) in 862 (51%) of the 1,690 studied scenarios. The 12 state-of-the-art tuners were more effective than random search in 95 (6%) of the 1,560 non-random-search scenarios. Although not effective on every dataset, the combination of FLASH tuning, logarithm transformation, and support vector regression obtained the top accuracy ranking on the largest number (8 out of 13) of datasets, while Hyperband-tuned ridge regression with logarithm transformation obtained the top stability ranking on the largest number (10 out of 13) of datasets. We endorse the use of random search as a baseline for comparison in future studies of hyper-parameter tuning.
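
As a concrete illustration of the random-search baseline paired with the logarithm transformation and support vector regression, the sketch below tunes an SVR with scikit-learn's RandomizedSearchCV on synthetic data. The library choice, search ranges, and toy dataset are assumptions made for illustration only; the study's actual experimental setup may differ.

```python
# Minimal sketch of random-search hyper-parameter tuning for SEE.
# Assumptions: scikit-learn as the tuning library, hypothetical search
# ranges, and a synthetic stand-in for a PROMISE-style effort dataset.
import numpy as np
from scipy.stats import loguniform
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy stand-in for an SEE dataset: a few project features (X) and the
# development effort (y) as the regression target.
rng = np.random.default_rng(0)
X = rng.uniform(1, 100, size=(60, 4))
y = X.sum(axis=1) * rng.uniform(0.8, 1.2, size=60)

# log1p/expm1 implement the logarithm transformation of the effort target.
model = TransformedTargetRegressor(
    regressor=Pipeline([("scale", StandardScaler()), ("svr", SVR())]),
    func=np.log1p,
    inverse_func=np.expm1,
)

# Hypothetical search space; the paper's exact ranges are not reproduced here.
param_distributions = {
    "regressor__svr__C": loguniform(1e-2, 1e3),
    "regressor__svr__gamma": loguniform(1e-4, 1e1),
    "regressor__svr__epsilon": loguniform(1e-3, 1e0),
}

search = RandomizedSearchCV(
    model,
    param_distributions,
    n_iter=60,                          # budget of random configurations
    scoring="neg_mean_absolute_error",  # MAE-style error, common in SEE
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```

The same loop generalizes to the other models in the study: swapping SVR for a classification and regression tree or a ridge regressor only requires changing the estimator and its parameter distributions, which is what makes random search a cheap, uniform baseline.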
