A support vector machine formulation to PCA analysis and its kernel version

In this paper, we present a simple and straightforward primal-dual support vector machine formulation of the problem of principal component analysis (PCA) in dual variables. By considering a mapping to a high-dimensional feature space and applying the kernel trick (Mercer's theorem), kernel PCA is obtained as introduced by Schölkopf et al. (1998). While least squares support vector machine classifiers have a natural link with kernel Fisher discriminant analysis (minimizing the within-class scatter around targets +1 and -1), PCA can be interpreted as a one-class modeling problem with a zero target value around which the variance is maximized. The score variables are interpreted as error variables within the problem formulation. In this way, primal-dual constrained optimization interpretations of linear and kernel PCA are obtained in a similar style as for least squares support vector machine classifiers.
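To make the constrained optimization interpretation concrete, the following is a minimal sketch, not taken verbatim from the paper; the notation follows the least squares support vector machine style described above, and the regularization constant \gamma, feature map \varphi, and feature-space mean \hat{\mu}_\varphi are assumptions of this sketch:

\max_{w,e}\; J(w,e) = \frac{\gamma}{2}\sum_{i=1}^{N} e_i^{2} - \frac{1}{2}\, w^{\top} w
\quad\text{subject to}\quad e_i = w^{\top}\bigl(\varphi(x_i) - \hat{\mu}_\varphi\bigr),\quad i = 1,\dots,N.

Eliminating w and the error (score) variables e_i through the Lagrangian yields an eigenvalue problem in the dual variables \alpha, namely \Omega_c \alpha = \lambda \alpha, where \Omega_c is the centered kernel (Gram) matrix obtained via Mercer's theorem. A short numerical sketch of this dual step in Python follows; the RBF kernel choice, the function name kernel_pca_scores, and the two-ring toy data are illustrative assumptions, not part of the original formulation.

import numpy as np

def kernel_pca_scores(X, gamma_rbf=1.0, n_components=2):
    """Kernel PCA as an eigenvalue problem on the centered kernel matrix.

    Sketch of the dual step: eigenvectors alpha of the centered Gram
    matrix Omega_c give the score (error) variables as projections.
    """
    n = X.shape[0]
    # RBF kernel: Omega[i, j] = exp(-gamma_rbf * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    Omega = np.exp(-gamma_rbf * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    # Centering in feature space: Omega_c = M Omega M with M = I - (1/n) 1 1^T
    M = np.eye(n) - np.ones((n, n)) / n
    Omega_c = M @ Omega @ M
    # Dual eigenvalue problem: Omega_c alpha = lambda alpha
    eigvals, eigvecs = np.linalg.eigh(Omega_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, alpha = eigvals[order], eigvecs[:, order]
    # Normalize alpha so each component has unit norm in feature space
    alpha = alpha / np.sqrt(np.maximum(lam, 1e-12))
    # Score variables: projections of the mapped training points
    scores = Omega_c @ alpha
    return lam, alpha, scores

# Toy usage: two noisy concentric rings, a case where linear PCA
# cannot separate the two scales but kernel PCA can.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
r = np.repeat([1.0, 3.0], 100) + 0.05 * rng.standard_normal(200)
X = np.c_[r * np.cos(t), r * np.sin(t)]
lam, alpha, scores = kernel_pca_scores(X, gamma_rbf=2.0)
print(scores.shape)  # (200, 2)

The score variables returned here play the role of the error variables e_i in the constrained problem above: projections of each mapped point onto the principal directions in feature space.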

[1] Tomaso A. Poggio, et al. Regularization Networks and Support Vector Machines, 2000, Advances in Computational Mathematics.

[2] G. Baudat, et al. Generalized Discriminant Analysis Using a Kernel Approach, 2000, Neural Computation.

[3] Vladimir N. Vapnik. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[4] Bernhard Schölkopf, et al. Learning with Kernels, 2001.

[5] Johan A. K. Suykens, et al. Bayesian Framework for Least-Squares Support Vector Machine Classifiers, Gaussian Processes, and Kernel Fisher Discriminant Analysis, 2002, Neural Computation.

[6] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Advances in Computers.

[7] Mark Girolami. Orthogonal Series Density Estimation and the Kernel Eigenvalue Problem, 2002, Neural Computation.

[8] Harold Hotelling. Simplified calculation of principal components, 1936.

[9] Johan A. K. Suykens, et al. Optimal control by least squares support vector machines, 2001, Neural Networks.

[10] B. Schölkopf, et al. Fisher discriminant analysis with kernels, 1999, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop.

[11] J. Gower. Some distance properties of latent root and vector methods used in multivariate analysis, 1966.

[12] Alexander Gammerman, et al. Ridge Regression Learning Algorithm in Dual Variables, 1998, ICML.

[13] Matthias W. Seeger, et al. Using the Nyström Method to Speed Up Kernel Machines, 2000, NIPS.

[14] G. Wahba. Spline Models for Observational Data, 1990.

[15] Gene H. Golub, et al. Matrix Computations, 1983.

[16] H. Hotelling. Relations Between Two Sets of Variates, 1936.

[17] Johan A. K. Suykens, et al. Weighted least squares support vector machines: robustness and sparse approximation, 2002, Neurocomputing.

[18] Johan A. K. Suykens, et al. Least Squares Support Vector Machine Classifiers, 1999, Neural Processing Letters.

[19] Carl E. Rasmussen, et al. In Advances in Neural Information Processing Systems, 2011.

[20] F. Girosi, et al. Networks for approximation and learning, 1990, Proceedings of the IEEE.

[21] Bernhard Schölkopf, et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem, 1998, Neural Computation.

[22] Keinosuke Fukunaga. Introduction to Statistical Pattern Recognition, 1972.

[23] Sun-Yuan Kung, et al. Principal Component Neural Networks: Theory and Applications, 1996.

[24] Gunnar Rätsch, et al. Input space versus feature space in kernel-based methods, 1999, IEEE Transactions on Neural Networks.

[25] Karl Pearson. On lines and planes of closest fit to systems of points in space, 1901.