On Error Correlation and Accuracy of Nearest Neighbor Ensemble Classifiers

Recent empirical work has shown that combining predictors can lead to significant reductions in generalization error. Unfortunately, many combining methods do not improve nearest neighbor (NN) classifiers at all, because NN methods are very robust with respect to variations of the training set, which is precisely the kind of perturbation that resampling-based combiners such as bagging rely on. They are, however, sensitive to the choice of input features. We exploit this instability of NN classifiers with respect to different feature choices to generate an effective and diverse set of NN classifiers. Interestingly, the approach takes advantage of the high dimensionality of the data. We investigate techniques that decorrelate the errors of the ensemble members while keeping the individual classifiers accurate, and we analyze the results both in terms of error rates and error correlations. The experimental results show that our technique offers significant performance improvements over competing methods.
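The abstract does not spell out the construction, but the core idea it describes, an ensemble of NN classifiers made diverse by restricting each member to a different random feature subset and combining their predictions by majority vote, can be sketched in a few lines of Python. Everything below (1-NN members, the number of members, the subset size, plain majority voting, and mean pairwise error correlation as the diversity diagnostic) is an illustrative assumption, not the paper's exact procedure:

```python
import numpy as np

def one_nn_predict(X_train, y_train, X_test):
    """Plain 1-NN: each test point gets the label of its closest training point."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d2.argmin(axis=1)]

def subspace_nn_ensemble(X_train, y_train, X_test,
                         n_members=15, subset_size=None, seed=0):
    """Ensemble of 1-NN classifiers, each restricted to a random feature subset.

    Members see all training points but different feature subsets, so their
    errors are (partly) decorrelated; predictions are combined by majority
    vote. n_members and subset_size are illustrative defaults, not values
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    if subset_size is None:
        subset_size = max(1, n_features // 2)  # assumed default

    # One prediction vector per member; assumes integer class labels >= 0.
    votes = np.stack([
        one_nn_predict(X_train[:, feats], y_train, X_test[:, feats])
        for feats in (rng.choice(n_features, size=subset_size, replace=False)
                      for _ in range(n_members))
    ])  # shape (n_members, n_test)

    # Majority vote across members for each test point.
    majority = np.array([np.bincount(col).argmax() for col in votes.T])
    return majority, votes

def mean_pairwise_error_correlation(votes, y_true):
    """Mean correlation between members' 0/1 error indicators (lower = more diverse)."""
    errors = (votes != y_true).astype(float)  # (n_members, n_test)
    c = np.corrcoef(errors)                   # NaN rows arise if a member is error-free
    iu = np.triu_indices_from(c, k=1)
    return np.nanmean(c[iu])
```

On a held-out split, one would compare the ensemble's error rate against a single NN classifier trained on all features, and track how the mean pairwise error correlation moves as the subset size varies; the abstract's claim is that feature-based diversity is what makes the combination effective where resampling-based schemes fail.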
