Kernel PLS-SVC for Linear and Nonlinear Classification

We propose a new classification method: kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space, followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension-reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher's approach to linear discrimination, or equivalently to canonical correlation analysis, which makes it preferable to PCA for discrimination. The good performance of the proposed method is demonstrated on 13 benchmark data sets and on the real-world problem of distinguishing finger-movement periods from non-movement periods in electroencephalogram (EEG) recordings.
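The pipeline the abstract describes, projecting data onto kernel orthonormalized PLS components and then training a support vector classifier on the projections, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy `make_moons` data, the RBF kernel width, the small ridge term for numerical stability, and the single extracted component (the label covariance matrix has rank one for two classes) are all assumptions made for the sketch.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KernelCenterer
from sklearn.svm import SVC

def rbf_kernel(A, B, gamma=2.0):
    # pairwise squared Euclidean distances, then Gaussian kernel
    d = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d)

# toy two-class problem (illustrative assumption)
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

# centered training kernel matrix
K = rbf_kernel(Xtr, Xtr)
centerer = KernelCenterer().fit(K)
Kc = centerer.transform(K)

# centered one-hot label matrix
Y = np.eye(2)[ytr]
Yc = Y - Y.mean(axis=0)

# simplified kernel orthonormalized PLS: generalized eigenproblem
#   Kc Yc Yc^T Kc a = lambda (Kc Kc + ridge*I) a
# the ridge term is an assumption added for numerical stability
n = Kc.shape[0]
M = Kc @ Yc @ Yc.T @ Kc
N = Kc @ Kc + 1e-6 * np.eye(n)
vals, vecs = eigh(M, N)          # eigenvalues in ascending order
A = vecs[:, -1:]                 # leading component suffices for 2 classes

# project training and (consistently centered) test data, then fit the SVC
Ttr = Kc @ A
Tte = centerer.transform(rbf_kernel(Xte, Xtr)) @ A
clf = SVC(kernel="linear", C=1.0).fit(Ttr, ytr)
acc = clf.score(Tte, yte)
print(f"test accuracy: {acc:.3f}")
```

Because the class information is already captured by the kernel PLS projection, a linear SVM on the one-dimensional scores is enough here; on real data one would cross-validate the kernel width, the number of components, and the SVM cost parameter.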
