Learning by local kernel polarization

The problem of evaluating the quality of a kernel function for a classification task is considered. Drawing on an analogy from physics, kernel polarization was introduced as an effective measure for selecting kernel parameters, a task previously handled mostly by exhaustive search. However, it takes only between-class separability into account and neglects the preservation of within-class local structure. The 'globality' of kernel polarization may also leave fewer degrees of freedom for increasing separability. In this paper, we propose a new quality measure called local kernel polarization, a localized variant of kernel polarization. Local kernel polarization preserves the local structure of the data within each class, so the data can be embedded more appropriately. The quality measure is demonstrated on several UCI machine learning benchmark datasets.
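To make the measure concrete, the sketch below computes kernel polarization in the sense of Baram (2005), P(K) = sum_ij y_i y_j K_ij = y^T K y for binary labels y in {-1, +1}, together with one plausible localized variant in which same-class pairs are weighted by a heat-kernel affinity so that only nearby same-class points contribute. The helper names (rbf_kernel, local_kernel_polarization), the affinity weighting scheme, and the parameters gamma and sigma are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch: kernel polarization and a hypothetical localized variant.
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_polarization(K, y):
    """Kernel polarization (Baram, 2005): P(K) = sum_ij y_i y_j K_ij = y^T K y."""
    return float(y @ K @ y)

def local_kernel_polarization(K, X, y, sigma=1.0):
    """Illustrative localized variant (an assumption, not the paper's definition):
    same-class pairs are down-weighted by a heat-kernel affinity so only nearby
    same-class points are pulled together, while between-class pairs keep their
    full negative weight."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    affinity = np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))
    same_class = (y[:, None] == y[None, :])
    # Localized weight within a class, global -1 weight between classes.
    W = np.where(same_class, affinity, -1.0)
    return float(np.sum(W * K))

# Usage: compare RBF widths on toy two-class data by each measure.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
for gamma in (0.01, 0.1, 1.0, 10.0):
    K = rbf_kernel(X, gamma)
    print(gamma, kernel_polarization(K, y), local_kernel_polarization(K, X, y))
```

In this sketch the kernel parameter would be chosen as the one maximizing the measure, which is how such quality measures are typically used for kernel selection.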
