Validity and reliability of evaluation procedures in comparative studies of effort prediction models

In previous studies, we have reported findings and concerns about the reliability and validity of the evaluation procedures used in comparative studies of competing effort prediction models. In particular, we have raised concerns about using accuracy statistics to rank and select models. These concerns are strengthened by the observed lack of consistent findings. This study offers more insight into the causes of conclusion instability by elaborating on our previous findings concerning the reliability and validity of the evaluation procedures. We show that model selection based on the accuracy statistics MMRE, MMER, MBRE, and MIBRE contributes to conclusion instability as well as to the selection of inferior models. We argue and show that, to better prevent the selection of inferior models, the evaluation procedure must include an assessment of whether the functional form of the prediction model makes sense.
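To make the four accuracy statistics concrete, the following is a minimal sketch in Python. The abstract does not define them, so the sketch assumes the definitions commonly used in the effort estimation literature: MMRE is the mean of |actual - predicted| / actual, MMER divides by the predicted value instead, MBRE divides by the smaller of the two, and MIBRE divides by the larger. The function name `accuracy_statistics` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import mean


def accuracy_statistics(actual, predicted):
    """Return MMRE, MMER, MBRE and MIBRE for paired positive effort values.

    Definitions are assumed (commonly used forms), not quoted from the study.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a <= 0 for a in actual) or any(p <= 0 for p in predicted):
        raise ValueError("effort values must be positive")

    pairs = list(zip(actual, predicted))
    return {
        # Mean magnitude of relative error: error relative to the actual value.
        "MMRE": mean(abs(a - p) / a for a, p in pairs),
        # Mean magnitude of error relative to the estimate.
        "MMER": mean(abs(a - p) / p for a, p in pairs),
        # Mean balanced relative error: relative to the smaller of the two.
        "MBRE": mean(abs(a - p) / min(a, p) for a, p in pairs),
        # Mean inverted balanced relative error: relative to the larger.
        "MIBRE": mean(abs(a - p) / max(a, p) for a, p in pairs),
    }


# Hypothetical toy data: one model halves every actual effort, the other doubles it.
actual = [100, 200, 400]
under = accuracy_statistics(actual, [50, 100, 200])
over = accuracy_statistics(actual, [200, 400, 800])

for name in ("MMRE", "MMER", "MBRE", "MIBRE"):
    print(f"{name}: underestimating={under[name]:.2f}  overestimating={over[name]:.2f}")
```

On this toy data, MMRE ranks the underestimating model first, MMER ranks the overestimating model first, and MBRE and MIBRE rate the two as equal. This is only an illustration of how the choice of statistic can change a ranking; it is not a reproduction of the study's analysis.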
