A replicated comparison of cross-company and within-company effort estimation models using the ISBSG database

The ISBSG database was last used four years ago to compare the effort prediction accuracy of cross-company and within-company cost models. Since then, more than 2,000 projects have been volunteered to this database, which may have changed the trends previously observed. This paper therefore replicates a previous study by investigating how successful a cross-company cost model is (i) at estimating effort for projects that belong to a single company and were not used to build the cross-company model, and (ii) compared with a within-company cost model. Our within-company data set contained 184 software projects from a single company, and our cross-company data set contained 672 software projects. We analysed the data using forward stepwise regression. Our results did not corroborate those of the previous study: predictions based on the within-company model were not significantly more accurate than those based on the cross-company model.
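As a rough illustration of the analysis technique named above, the following sketch performs forward stepwise selection for a linear effort model. It is not the authors' procedure: the F-to-enter threshold of 4.0, the helper names (`solve`, `rss`, `forward_stepwise`), and the synthetic project data are all assumptions made for the example, and the paper's actual entry criterion and variable transformations are not stated in the abstract.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit on an intercept plus `cols`."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, t in zip(design, y))


def forward_stepwise(rows, y, f_to_enter=4.0):
    """Greedily add the predictor that most reduces RSS while its partial F is large enough.

    The threshold f_to_enter is an assumed value, not one taken from the paper.
    """
    selected = []
    remaining = list(range(len(rows[0])))
    current = rss(rows, y, selected)
    while remaining:
        best = min(remaining, key=lambda c: rss(rows, y, selected + [c]))
        new = rss(rows, y, selected + [best])
        dof = len(y) - len(selected) - 2  # residual degrees of freedom after adding `best`
        if dof <= 0:
            break
        f_stat = (current - new) / (new / dof) if new > 0 else float("inf")
        if f_stat < f_to_enter:
            break
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected


# Synthetic, hypothetical projects: column 0 (size) drives effort, columns 1-2 do not.
random.seed(0)
rows = [[random.uniform(1, 10), random.uniform(1, 10), random.uniform(1, 10)]
        for _ in range(60)]
effort = [5.0 + 3.0 * row[0] + random.gauss(0, 1) for row in rows]

selected = forward_stepwise(rows, effort)
print("selected predictor columns:", selected)
```

In a comparison like the one described, the same selection would be run once on the cross-company projects and once on the within-company projects, and the two resulting models would then be scored on held-out single-company projects.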
