Hit Miss Networks with Applications to Instance Selection

In supervised learning, a training set of labeled instances is used by a learning algorithm to generate a model (classifier), which is subsequently employed to decide the class label of new instances (generalization). Characteristics of the training set, such as its size and the presence of noisy instances, influence the learning algorithm and affect generalization performance. This paper introduces a new network-based representation of a training set, called a hit miss network (HMN), which provides a compact description of the nearest neighbor relation over pairs of instances from each pair of classes. We show that structural properties of HMNs correspond to properties of training points related to the one nearest neighbor (1-NN) decision rule, such as being a border or central point. This motivates us to use HMNs for improving the performance of the 1-NN classifier by removing instances from the training set (instance selection). We introduce three new HMN-based algorithms for instance selection: HMN-C, which removes instances without affecting the accuracy of 1-NN on the original training set; HMN-E, which performs a more aggressive storage reduction; and HMN-EI, which applies HMN-E iteratively. Their performance is assessed on 22 data sets with different characteristics, such as input dimension, cardinality, class balance, number of classes, noise content, and presence of redundant variables. Results of experiments on these data sets show that the accuracy of the 1-NN classifier increases significantly when HMN-EI is applied. Comparison with state-of-the-art editing algorithms for instance selection on these data sets indicates that HMN-EI achieves the best generalization performance, with no significant difference in storage requirements. Overall, these results indicate that HMNs provide a powerful graph-based representation of a training set that can be successfully applied for noise and redundancy reduction in instance-based learning.
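
The abstract describes the HMN construction only at a high level. The sketch below assumes a NumPy encoding of the training set and plain Euclidean distance (both assumptions, not stated in the abstract) and illustrates how the hit and miss in-degrees of each point can be computed, together with a deliberately simplified editing rule. It is an illustration of the idea, not the paper's exact HMN-C/HMN-E/HMN-EI algorithms, which include additional criteria; the function names are hypothetical.

```python
import numpy as np

def hit_miss_degrees(X, y):
    """Compute hit and miss in-degrees of every training point.

    For each point i and each class c, a directed edge goes from i to
    its nearest neighbor among the points of class c (i itself excluded).
    An incoming edge from a point of the same class is a 'hit'; an
    incoming edge from a point of another class is a 'miss'."""
    n = len(X)
    # Pairwise Euclidean distances; the diagonal is masked so that a
    # point never selects itself as a nearest neighbor.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    hit = np.zeros(n, dtype=int)
    miss = np.zeros(n, dtype=int)
    for i in range(n):
        for c in np.unique(y):
            members = np.flatnonzero(y == c)
            if members.size == 1 and members[0] == i:
                continue  # i is the only point of its class: no hit edge
            target = members[np.argmin(dist[i, members])]
            if c == y[i]:
                hit[target] += 1
            else:
                miss[target] += 1
    return hit, miss

def hmn_edit(X, y):
    """Illustrative editing rule, NOT the paper's exact HMN-E criterion:
    keep a point only if it is hit at least as often as it is missed.
    (An HMN-C-like variant would instead keep every point with nonzero
    in-degree, since a point that is nobody's within-class nearest
    neighbor can be dropped without changing 1-NN on the training set.)"""
    hit, miss = hit_miss_degrees(X, y)
    keep = hit >= miss
    return X[keep], y[keep]

if __name__ == "__main__":
    # Toy three-class problem to exercise the filter.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(loc=m, size=(20, 2)) for m in (0.0, 2.0, 4.0)])
    y = np.repeat([0, 1, 2], 20)
    Xe, ye = hmn_edit(X, y)
    print(f"kept {len(Xe)} of {len(X)} instances")
```

On this reading, a point with zero in-degree is never any point's within-class nearest neighbor, so discarding it cannot change 1-NN predictions on the remaining training points; this is the consistency property the abstract attributes to HMN-C, while the hit-versus-miss comparison captures the intuition behind the more aggressive HMN-E reduction.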
