Hyperparameter Optimization for Effort Estimation

Software analytics has been widely used in software engineering for many tasks, such as generating effort estimates for software projects. One of the "black arts" of software analytics is tuning the parameters that control a data mining algorithm. Such hyperparameter optimization has been widely studied in other software analytics domains (e.g., defect prediction and text mining) but, so far, has not been extensively explored for effort estimation. Accordingly, this paper seeks simple, automatic, effective, and fast methods for finding good tunings for automatic software effort estimation. We introduce a hyperparameter optimization architecture called OIL (Optimized Inductive Learning). We test OIL on a wide range of hyperparameter optimizers using data from 945 software projects. After tuning, large improvements in effort estimation accuracy were observed (measured in terms of standardized accuracy). From those results, we recommend using regression trees (CART) tuned by differential evolution, combined with a default analogy-based estimator. This particular combination of learner and optimizers often achieves in a few hours what other optimizers need days to weeks of CPU time to accomplish. An important part of this analysis is its reproducibility and refutability. All our scripts and data are online. It is hoped that this paper will prompt and enable much more research on better methods to tune software effort estimators.
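The abstract reports results in terms of standardized accuracy without defining it. As a minimal sketch, assuming the definition commonly used in the effort estimation literature, SA = (1 - MAE / MAE_P0) x 100, where MAE_P0 is the mean absolute error of random guessing (computed exactly here as the mean absolute difference over all pairs of actual efforts), the measure could be computed as follows. The function names and the sample data are hypothetical and are not taken from the paper's scripts.

```python
from itertools import combinations


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted effort."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_guess_mae(actual):
    """Exact MAE of a baseline that predicts each project's effort by
    picking the actual effort of a different project uniformly at random.
    This equals the mean absolute difference over all distinct pairs."""
    pairs = list(combinations(actual, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def standardized_accuracy(actual, predicted):
    """SA as a percentage: 100 is a perfect estimator, 0 is no better
    than random guessing, and negative values are worse than guessing."""
    baseline = random_guess_mae(actual)
    return (1 - mean_absolute_error(actual, predicted) / baseline) * 100


# Hypothetical efforts (e.g. person-months) for four projects.
actual = [10.0, 20.0, 40.0, 80.0]
predicted = [12.0, 18.0, 44.0, 72.0]
sa = standardized_accuracy(actual, predicted)
print(f"SA = {sa:.2f}")
```

Under this definition, larger values are better, so a tuner such as differential evolution would search for the hyperparameter setting that maximizes SA on held-out projects.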
