Finding conclusion stability for selecting the best effort predictor in software effort estimation

Abstract
Background: Conclusion instability in software effort estimation (SEE) refers to the inconsistent results produced by a diversity of predictors on different datasets. This is largely due to the "ranking instability" problem, which is closely tied to the evaluation criteria and the subset of the data being used. Aim: To determine stable rankings of different predictors. Method: 90 predictors are applied to 20 datasets and evaluated using 7 performance measures, whose results are subjected to a Wilcoxon rank test (95% confidence). These results are called the "aggregate results". The aggregate results are then challenged by a sanity check, which focuses on a single error measure (MRE) and uses a newly developed evaluation algorithm called CLUSTER. These results are called the "specific results". Results: The aggregate results show that: (1) it is now possible to draw stable conclusions about the relative performance of SEE predictors; and (2) regression trees or analogy-based methods are the best performers. The aggregate results are also confirmed by the specific results of the sanity check. Conclusion: This study offers a means to address the conclusion instability issue in SEE, which is an important finding for empirical software engineering.
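The sketch below illustrates, under stated assumptions, the style of evaluation the abstract describes: compute the magnitude of relative error (MRE = |actual − predicted| / actual) for each prediction, then apply a Wilcoxon-style rank test at 95% confidence to decide whether one predictor's errors are statistically lower than another's. It is not the paper's CLUSTER algorithm or its actual pipeline; the predictor names and data are hypothetical, and the choice of the rank-sum (Mann-Whitney) variant of the Wilcoxon test is an assumption.

```python
# Minimal sketch (not the paper's code) of MRE-based comparison of two
# effort predictors with a rank test at 95% confidence.
import numpy as np
from scipy.stats import mannwhitneyu  # Wilcoxon rank-sum / Mann-Whitney U

def mre(actual, predicted):
    """Magnitude of relative error: |actual - predicted| / actual."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted) / actual

def compare(mre_a, mre_b, alpha=0.05):
    """Return 'tie' if the rank test finds no difference at 95% confidence;
    otherwise report which predictor has the lower median MRE."""
    _, p = mannwhitneyu(mre_a, mre_b, alternative="two-sided")
    if p >= alpha:
        return "tie"
    return "A wins" if np.median(mre_a) < np.median(mre_b) else "B wins"

# Hypothetical example: actual efforts and two predictors' estimates.
actual = [120, 300, 45, 80, 210]
pred_a = [110, 280, 50, 90, 200]   # e.g. a regression tree
pred_b = [200, 150, 90, 30, 400]   # e.g. a weaker baseline
print(compare(mre(actual, pred_a), mre(actual, pred_b)))
```

In the study itself this kind of pairwise test is repeated across the 90 predictors, 20 datasets, and 7 performance measures to build win/tie/loss rankings; the snippet only shows a single pairwise comparison on one error measure.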
