Nearest neighbor ensemble

Recent empirical work has shown that combining predictors can significantly reduce generalization error. The individual predictors (weak learners) can be very simple, such as two-terminal-node trees (decision stumps); it is the aggregation scheme that gives them the power to increase prediction accuracy. Unfortunately, many combining methods do not improve nearest neighbor (NN) classifiers at all, because NN methods are very robust with respect to variations of the training data; in contrast, they are sensitive to the choice of input features. We exploit this instability of NN classifiers with respect to different feature choices to generate an effective and diverse set of NN classifiers with possibly uncorrelated errors. Interestingly, the approach takes advantage of the high dimensionality of the data. Experimental results show that our technique offers significant performance improvements over competitive methods.

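The core idea admits a short illustration. Below is a minimal sketch of an NN ensemble whose members differ only in the feature subset they see, not the paper's exact procedure: the class name FeatureSubsetNNEnsemble, the uniform random subset sampling, the sqrt(d) default subset size, and the simple majority vote are all assumptions made here for illustration.

```python
# Sketch: an ensemble of NN classifiers, each trained on a different
# random feature subset, combined by majority vote. Assumes uniform
# subset sampling and unweighted voting; the original method's sampling
# scheme and combination rule may differ.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


class FeatureSubsetNNEnsemble:
    def __init__(self, n_members=25, subset_size=None, n_neighbors=1, seed=0):
        self.n_members = n_members
        self.subset_size = subset_size  # features per member; default sqrt(d)
        self.n_neighbors = n_neighbors
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        # Encode labels as integer indices so voting can use bincount.
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        d = X.shape[1]
        k = self.subset_size or max(1, int(np.sqrt(d)))
        self.members_ = []
        for _ in range(self.n_members):
            feats = self.rng.choice(d, size=k, replace=False)
            clf = KNeighborsClassifier(n_neighbors=self.n_neighbors)
            clf.fit(X[:, feats], y_idx)
            self.members_.append((feats, clf))
        return self

    def predict(self, X):
        # Rows: one vector of predicted class indices per ensemble member.
        votes = np.stack([clf.predict(X[:, feats])
                          for feats, clf in self.members_])
        # Majority vote across members for each test point.
        maj = np.apply_along_axis(
            lambda col: np.bincount(col, minlength=len(self.classes_)).argmax(),
            0, votes)
        return self.classes_[maj]


# Example usage on synthetic data:
# from sklearn.datasets import make_classification
# X, y = make_classification(n_samples=200, n_features=30, random_state=0)
# print(FeatureSubsetNNEnsemble().fit(X, y).predict(X[:5]))
```

The design choice mirrors the abstract's argument: because each member sees a different projection of the feature space, member errors tend to decorrelate, and the more features are available to sample from, the more diverse the ensemble can be, which is why the approach benefits from high-dimensional data.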