Sets of approximating functions with finite Vapnik-Chervonenkis dimension for nearest-neighbors algorithms

A misconception sometimes encountered in the literature states that for nearest-neighbors algorithms there exists no fixed hypothesis class of finite Vapnik-Chervonenkis dimension. This paper presents a simple reformulation (not a modification) of the nearest-neighbors algorithm in which, instead of a natural number k, a percentage α ∈ (0, 1) of nearest neighbors is used. Owing to this reformulation, one can construct sets of approximating functions which we prove to have finite VC dimension. In a special (but practical) case this dimension equals ⌈2/α⌉. It then also becomes possible to form a sequence of sets of functions with increasing VC dimension, and to perform complexity selection via cross-validation or in the manner of the structural risk minimization framework. Results of such experiments are also presented.
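To make the reformulation concrete, below is a minimal Python sketch, not the authors' implementation: it implements the percentage-based neighbor rule and a simple complexity-selection loop over a sequence of α values via cross-validation. The function names alpha_nn_predict and select_alpha_by_cv, the Euclidean metric, the majority-vote decision, and the 5-fold scheme are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def alpha_nn_predict(X_train, y_train, X_query, alpha):
    """Classify each query point by majority vote over the
    ceil(alpha * n) nearest training points, where n is the
    training-set size (the percentage-based variant of k-NN)."""
    n = len(X_train)
    k = max(1, int(np.ceil(alpha * n)))  # neighbor count implied by the fraction alpha
    preds = np.empty(len(X_query), dtype=y_train.dtype)
    for i, x in enumerate(X_query):
        dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances (an assumption)
        nearest = np.argsort(dists)[:k]              # indices of the k nearest neighbors
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds[i] = labels[np.argmax(counts)]         # majority vote among neighbors
    return preds

def select_alpha_by_cv(X, y, alphas, n_folds=5, seed=None):
    """Pick the alpha with the lowest cross-validated error rate.
    Smaller alpha yields a richer class (VC dimension ceil(2/alpha)
    in the special case described in the abstract), so this loop is
    a plain form of complexity selection."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for alpha in alphas:
        fold_errs = []
        for f in range(n_folds):
            test_idx = folds[f]
            train_idx = np.concatenate([folds[g] for g in range(n_folds) if g != f])
            y_hat = alpha_nn_predict(X[train_idx], y[train_idx], X[test_idx], alpha)
            fold_errs.append(np.mean(y_hat != y[test_idx]))
        errors.append(np.mean(fold_errs))
    return alphas[int(np.argmin(errors))], errors
```

For instance, scanning alphas = [0.5, 0.25, 0.1, 0.05] corresponds, in the special case above, to VC dimensions 4, 8, 20 and 40, i.e. a nested sequence of function sets of growing complexity, as used in structural risk minimization.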
