K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms

Guided by an initial idea of building a complex (nonlinear) decision surface with maximal local margin in input space, we give a possible geometrical intuition as to why K-Nearest Neighbor (KNN) algorithms often perform more poorly than SVMs on classification tasks. We then propose modified K-Nearest Neighbor algorithms to overcome the perceived problem. The approach is similar in spirit to Tangent Distance, but with invariances inferred from the local neighborhood rather than from prior knowledge. Experimental results on real-world classification tasks suggest that the modified KNN algorithms often give a dramatic improvement over standard KNN and perform as well as or better than SVMs.
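The abstract does not spell out the modified algorithms, so the following is only a minimal sketch of one plausible reading of "K-local hyperplane" in the title: for each class, take the K training points of that class nearest to the query, and measure the distance from the query to the affine hull of those points; the query is assigned to the class with the smallest such distance. The function names, the `ridge` regularizer, and the toy data are assumptions made for illustration and are not taken from the paper. The convex distance variant is not sketched here.

```python
import numpy as np


def local_hyperplane_distance(x, neighbors, ridge=1e-8):
    """Squared distance from x to the affine hull of the rows of `neighbors`.

    `ridge` is an assumed small regularizer: the centered neighbors are
    linearly dependent, so the Gram matrix is singular without it.
    """
    centroid = neighbors.mean(axis=0)
    V = (neighbors - centroid).T              # shape (d, K), columns span the hull
    G = V.T @ V + ridge * np.eye(V.shape[1])  # regularized Gram matrix
    a = np.linalg.solve(G, V.T @ (x - centroid))
    residual = x - centroid - V @ a           # component of x off the hull
    return float(residual @ residual)


def local_hyperplane_predict(x, X_train, y_train, k=3):
    """Assign x to the class whose local hyperplane is closest (hypothetical helper)."""
    best_label, best_dist = None, np.inf
    for label in np.unique(y_train):
        Xc = X_train[y_train == label]
        # K nearest neighbors of x within this class only
        order = np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]
        dist = local_hyperplane_distance(x, Xc[order])
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data (assumed): class 0 is sparsely sampled along y = 0,
# class 1 along y = 1.5. The query lies close to the line y = 0.
X_train = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 1.5], [3.0, 1.5]])
y_train = np.array([0, 0, 1, 1])
query = np.array([2.0, 0.2])

nearest_label = y_train[np.argmin(np.linalg.norm(X_train - query, axis=1))]
hyperplane_label = local_hyperplane_predict(query, X_train, y_train, k=2)
```

In this toy case the single nearest training point belongs to class 1, while the local hyperplane rule picks class 0 because the query lies almost on the line through the class 0 samples. This illustrates the kind of "hole" between sparse samples that a locally linear fill-in could address, under the stated assumptions.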
