Reduction Techniques for Exemplar-Based Learning Algorithms

Exemplar-based learning algorithms are often faced with the problem of deciding which instances or other exemplars to store for use during generalization. Storing too many exemplars can result in large memory requirements and slow execution speed, and can also cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce the number of exemplars retained in exemplar-based learning models. Second, it proposes six new reduction algorithms, called DROP1-5 and DEL, that can be used to prune instances from the concept description. These six algorithms, along with ten algorithms from the survey, are compared on 31 datasets. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest generalization accuracy in these experiments, especially in the presence of noise.
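To make the pruning idea concrete, the following is a minimal sketch of a DROP1-style removal criterion under a plain k-NN classifier: an instance p is dropped when its "associates" (the stored instances that count p among their nearest neighbors) are classified at least as accurately without p as with it. The function and helper names are illustrative, not the paper's implementation, and the details (tie-breaking, neighbor-list maintenance, noise filtering in DROP3-5) are simplified.

```python
import numpy as np

def _neighbors(X, i, removed, k):
    """Indices of the k nearest kept instances to X[i], excluding i itself."""
    dists = np.linalg.norm(X - X[i], axis=1)
    order = [j for j in np.argsort(dists) if j != i and j not in removed]
    return order[:k]

def _vote(y, idx):
    """Majority class among the instances at idx (None if idx is empty)."""
    if not idx:
        return None
    vals, counts = np.unique(y[idx], return_counts=True)
    return vals[np.argmax(counts)]

def drop1_style_prune(X, y, k=3):
    """DROP1-style sketch: remove p when at least as many of its associates
    are classified correctly without p as with it."""
    removed = set()
    n = len(X)
    for p in range(n):
        # Associates: kept instances that currently have p among their k neighbors.
        associates = [a for a in range(n)
                      if a != p and a not in removed
                      and p in _neighbors(X, a, removed, k)]
        hyp_removed = removed | {p}  # hypothetically remove p
        with_p = sum(_vote(y, _neighbors(X, a, removed, k)) == y[a]
                     for a in associates)
        without_p = sum(_vote(y, _neighbors(X, a, hyp_removed, k)) == y[a]
                        for a in associates)
        if without_p >= with_p:
            removed.add(p)
    return sorted(set(range(n)) - removed)
```

On clean, well-separated data this rule discards interior points aggressively, since their associates remain correctly classified by other same-class neighbors; the later DROP variants refine the removal order and add noise filtering to improve robustness.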
