Meta-data to Guide Retrieval in CBR for Software Cost Prediction

In recent years, case-based prediction has become a widely advocated technique for software cost estimation. Typically, such approaches are in essence k-nearest-neighbour methods supported by a case base that holds one feature vector per software project. Because software project data is relatively rare (data sets may contain as few as 20 cases), it is common to take a relatively undiscriminating approach to the projects included in a case base. We hypothesise that meta-data generated by monitoring case performance can help to identify misleading cases and so improve predictions. This paper reports the results of a pilot study in which we enriched our case base with meta-data to record performance behaviour of individual data sets. An external fuzzy model was used to classify individual cases as fit or unfit for future use. Misleading cases, i.e. cases with poor predictive ability, were seeded into the data set to assess the potential of the approach. Our results show that the model successfully identified the seeded cases and refrained from using them in subsequent retrievals.
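
The idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a k-nearest-neighbour estimator over numeric feature vectors, records the relative error of each case every time it is retrieved as meta-data, and filters out cases judged unfit before retrieval. The abstract does not specify the fuzzy model, so a crisp threshold on mean relative error stands in for it here. All names (`Case`, `is_fit`, `predict`, `record_outcome`) and the parameters `max_mean_error` and `min_uses` are hypothetical.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Case:
    features: tuple          # numeric feature vector describing one project
    effort: float            # known effort of the completed project
    # Meta-data (assumed form): relative errors observed when this case was retrieved.
    errors: list = field(default_factory=list)

    def is_fit(self, max_mean_error=0.5, min_uses=2):
        """Crisp stand-in for the paper's fuzzy fit/unfit classifier."""
        if len(self.errors) < min_uses:
            return True  # too little evidence to reject the case
        return sum(self.errors) / len(self.errors) <= max_mean_error


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(case_base, target, k=2):
    """Estimate effort as the mean effort of the k nearest fit cases."""
    usable = [c for c in case_base if c.is_fit()]
    neighbours = sorted(usable, key=lambda c: distance(c.features, target))[:k]
    estimate = sum(c.effort for c in neighbours) / len(neighbours)
    return estimate, neighbours


def record_outcome(neighbours, actual_effort):
    """Update meta-data once the true effort of the target project is known."""
    for c in neighbours:
        c.errors.append(abs(c.effort - actual_effort) / actual_effort)
```

In use, each call to `predict` would be followed by `record_outcome` once the real effort is known. A seeded case that repeatedly yields large errors then fails `is_fit` and is skipped in later retrievals.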
