Complexity Profiling for Informed Case-Base Editing

The contents of the case knowledge container are critical to the performance of case-based classification systems. However, the knowledge engineer is given little support in selecting suitable techniques to maintain and monitor the case-base. In this paper we present a novel technique that provides insight into the structure of a case-base by means of a complexity profile, which can assist maintenance decision-making and provide a benchmark against which future changes to the case-base can be assessed. We also introduce a complexity-guided redundancy reduction algorithm that uses a local complexity measure to actively retain cases close to class boundaries. The algorithm offers control over the balance between maintaining competence and reducing case-base size. The ability of the algorithm to maintain accuracy in a compacted case-base is demonstrated on seven public-domain classification datasets.
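
The abstract does not define the local complexity measure or the reduction algorithm, so the following is only a rough sketch of the general idea, not the paper's method. It assumes (hypothetically) that a case's local complexity is the fraction of its k nearest neighbours with a different class label, that the profile is those scores sorted in descending order, and that reduction keeps only cases scoring above a threshold. The names `local_complexity`, `complexity_profile`, `retain_boundary_cases`, `k` and `threshold` are illustrative.

```python
import math


def local_complexity(cases, index, k=3):
    """Assumed measure: share of the k nearest neighbours of cases[index]
    whose class label differs from the case's own label."""
    features, label = cases[index]
    others = [
        (math.dist(features, other_features), other_label)
        for i, (other_features, other_label) in enumerate(cases)
        if i != index
    ]
    # Sort by distance only; ties keep their original order (stable sort).
    others.sort(key=lambda pair: pair[0])
    neighbours = others[:k]
    mismatches = sum(1 for _, other_label in neighbours if other_label != label)
    return mismatches / len(neighbours)


def complexity_profile(cases, k=3):
    """Per-case complexity scores, highest first."""
    scores = [local_complexity(cases, i, k) for i in range(len(cases))]
    return sorted(scores, reverse=True)


def retain_boundary_cases(cases, k=3, threshold=0.0):
    """Keep cases whose assumed complexity exceeds `threshold`.
    A higher threshold gives a smaller case-base."""
    return [
        case
        for i, case in enumerate(cases)
        if local_complexity(cases, i, k) > threshold
    ]


# Toy one-dimensional case-base: class "a" on the left, class "b" on the right.
cases = [((float(x),), "a") for x in range(4)] + [((float(x),), "b") for x in range(4, 8)]
profile = complexity_profile(cases, k=2)
kept = retain_boundary_cases(cases, k=2, threshold=0.0)
```

In this toy example only the two cases either side of the class boundary have non-zero complexity, so they are the ones retained. A practical algorithm would also need to keep enough interior cases to preserve competence, which this sketch does not attempt.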
