Decision Support Analysis for Software Effort Estimation by Analogy

Effort estimation by analogy (EBA) is an established method for software effort estimation. In this paper, we treat EBA as a meta-method that must be instantiated and customized for a specific context at several stages and decision points. Example decision problems include the choice of similarity measure, the selection of analogs for adaptation, and the weighting and selection of attributes. We propose a decision-centric process model for EBA that generalizes existing EBA methods. As part of the model, typical decision-making problems are identified at the different stages of the process, and existing solution alternatives for these problems are then studied. The results of this decision support analysis can help in understanding EBA-related techniques and provide guidelines for implementing and customizing EBA in general. Finally, an example instance of the process model is presented.
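
To make the idea of EBA as a meta-method more concrete, the following minimal Python sketch exposes three of the decision points named in the abstract as parameters: the similarity measure, the number of analogs retrieved, and the attribute weights, plus the adaptation rule applied to the analogs. It is an illustration only and not the process model proposed in the paper. All names (`weighted_euclidean_similarity`, `estimate_by_analogy`), the distance-based similarity, the mean adaptation, and the toy project data are assumptions made for the example.

```python
from math import sqrt
from statistics import mean


def weighted_euclidean_similarity(a, b, weights):
    """Hypothetical similarity measure: 1 / (1 + weighted Euclidean distance).

    Returns a value in (0, 1]; 1.0 means the two projects agree on
    every weighted attribute.
    """
    dist = sqrt(sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items()))
    return 1.0 / (1.0 + dist)


def estimate_by_analogy(target, history, weights, k=2,
                        similarity=weighted_euclidean_similarity,
                        adapt=mean):
    """Rank past projects by similarity, keep the k best, adapt their efforts.

    Decision points exposed as parameters (all illustrative):
      weights    -- attribute weighting and selection (weight 0 drops an attribute)
      k          -- how many analogs are used for adaptation
      similarity -- which similarity measure is applied
      adapt      -- how the analogs' efforts are combined into an estimate
    """
    ranked = sorted(history,
                    key=lambda p: similarity(target, p["attrs"], weights),
                    reverse=True)
    analogs = ranked[:k]
    return adapt([p["effort"] for p in analogs]), analogs


# Toy historical data with attributes already normalized to [0, 1] (assumed).
history = [
    {"name": "P1", "attrs": {"size": 0.2, "team": 0.3}, "effort": 100.0},
    {"name": "P2", "attrs": {"size": 0.8, "team": 0.9}, "effort": 400.0},
    {"name": "P3", "attrs": {"size": 0.3, "team": 0.2}, "effort": 140.0},
]
target = {"size": 0.25, "team": 0.25}
weights = {"size": 1.0, "team": 1.0}

estimate, analogs = estimate_by_analogy(target, history, weights, k=2)
print(estimate, sorted(p["name"] for p in analogs))
```

Swapping in a different `similarity` function, changing `k`, or setting a weight to zero changes the resulting estimator without touching the overall procedure, which is the sense in which each choice is a separate decision problem.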
