On Parzen windows classifiers

Parzen Windows have been applied with considerable success to a variety of density estimation and classification tasks. Parzen Windows are known to converge in the asymptotic limit. However, theoretical analysis of their performance with finite samples is lacking. In this paper we show a connection between Parzen Windows and the regularized least squares algorithm, which has a well-established foundation in computational learning theory. This connection allows us to provide useful insight into Parzen Windows classifiers and their performance in finite sample settings. Finally, we present empirical results on the performance of Parzen Windows classifiers on a number of real data sets.
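For readers unfamiliar with the method, a minimal sketch of a binary Parzen Windows classifier follows. It is an illustration under stated assumptions, not the formulation analyzed in the paper: the Gaussian kernel, the `bandwidth` parameter, the label encoding in {-1, +1}, the function names, and the toy data are all hypothetical choices made here. A point is assigned the sign of the label-weighted average of kernel values over the training set.

```python
import math


def gaussian_kernel(x, z, bandwidth):
    """Gaussian window centered at z, evaluated at x (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def parzen_score(x, train_points, train_labels, bandwidth=1.0):
    """Label-weighted average of kernel values; labels are assumed to be -1 or +1."""
    total = sum(
        y * gaussian_kernel(x, xi, bandwidth)
        for xi, y in zip(train_points, train_labels)
    )
    return total / len(train_points)


def parzen_classify(x, train_points, train_labels, bandwidth=1.0):
    """Predict +1 if the score is non-negative, otherwise -1."""
    score = parzen_score(x, train_points, train_labels, bandwidth)
    return 1 if score >= 0 else -1


# Hypothetical toy data: two well-separated clusters in the plane.
train_points = [(0.0, 0.0), (0.2, 0.1), (2.0, 2.0), (2.1, 1.8)]
train_labels = [-1, -1, 1, 1]

near_negative = parzen_classify((0.1, 0.0), train_points, train_labels)
near_positive = parzen_classify((1.9, 2.0), train_points, train_labels)
```

In this sketch each training point contributes a vote weighted by its kernel value, so nearby points dominate the decision. The bandwidth controls how quickly that influence decays with distance, and its choice matters most when the sample is small.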
