Investigating the use of duration‐based windows and estimation by analogy for COCOMO

In model‐based software estimation, choosing the right training data is key to making accurate predictions, which in turn is crucial to the success of software projects. This study investigates two approaches to calibrating COCOMO, duration‐based windows and estimation by analogy, and assesses their estimation performance. We compare these approaches against the use of all available historical data on the COCOMO data set of 341 projects and the NASA data set of 93 projects. The results show that the data sets contain timing information that affects estimation accuracy. Given sufficient data for calibration, using recently completed projects within short durations generates more accurate estimates than retaining all historical data or using k‐nearest neighbors based on estimation by analogy. More training data spanning a long period of time may not improve estimation accuracy. This study offers evidence to support training estimation models on projects completed within recent years.
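The two training-data selection strategies compared above can be illustrated with a minimal Python sketch. It is not the study's calibration procedure: it assumes a simplified effort model of the form effort = A * size^B fitted by least squares in log space, ignoring COCOMO's effort multipliers and scale factors. The `Project` record, its fields, and the function names are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Project:
    year: int      # completion year (assumed field)
    ksloc: float   # size in thousands of source lines of code
    effort: float  # actual effort in person-months


def duration_window(history, target_year, window_years):
    """Keep only projects completed within `window_years` before `target_year`."""
    return [p for p in history
            if target_year - window_years <= p.year < target_year]


def calibrate(projects):
    """Fit effort = a * ksloc**b by ordinary least squares on the logs.

    Simplifying assumption: no effort multipliers or scale factors.
    Needs at least two projects with different sizes.
    """
    xs = [math.log(p.ksloc) for p in projects]
    ys = [math.log(p.effort) for p in projects]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def knn_analogy(history, ksloc, k):
    """Estimate effort as the mean effort of the k projects closest in log size."""
    nearest = sorted(
        history,
        key=lambda p: abs(math.log(p.ksloc) - math.log(ksloc)),
    )[:k]
    return sum(p.effort for p in nearest) / len(nearest)
```

Under these assumptions, a windowed estimate for a new project would call `calibrate(duration_window(history, target_year, window_years))` and then compute `a * ksloc**b`, whereas the analogy-based estimate would call `knn_analogy(history, ksloc, k)` on the full history.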
