Fisher discriminant analysis with kernels

A non-linear classification technique based on Fisher's discriminant is proposed. The main ingredient is the kernel trick, which allows the efficient computation of the Fisher discriminant in feature space. The linear classification in feature space corresponds to a (powerful) non-linear decision function in input space. Large-scale simulations demonstrate the competitiveness of our approach.
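As a concrete illustration of the idea, below is a minimal sketch of a two-class kernel Fisher discriminant in Python (not the authors' code). It assumes a Gaussian RBF kernel and adds a small ridge term mu to the within-class scatter matrix, which is singular by construction; the names kernel_fda, gamma, and mu are illustrative, not taken from the paper.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and the rows of Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_fda(X, y, gamma=1.0, mu=1e-3):
    # Two-class kernel Fisher discriminant: returns expansion coefficients
    # alpha so that the projection of a point x is sum_j alpha_j k(x_j, x).
    K = rbf_kernel(X, X, gamma)              # n x n kernel matrix
    n = len(y)
    classes = np.unique(y)
    assert len(classes) == 2, "sketch covers the two-class case only"
    M = []                                   # kernelized class means
    N = np.zeros((n, n))                     # within-class scatter in feature space
    for c in classes:
        Kc = K[:, y == c]                    # n x n_c block for class c
        n_c = Kc.shape[1]
        M.append(Kc.mean(axis=1))
        # N += K_c (I - 1/n_c) K_c^T, the within-class scatter contribution
        N += Kc @ (np.eye(n_c) - np.full((n_c, n_c), 1.0 / n_c)) @ Kc.T
    # Ridge-regularized solve; mu > 0 makes N + mu*I invertible.
    alpha = np.linalg.solve(N + mu * np.eye(n), M[1] - M[0])
    return alpha

def project(alpha, X_train, X_new, gamma=1.0):
    # Evaluate the discriminant on new points.
    return rbf_kernel(X_new, X_train, gamma) @ alpha

A new point can then be classified by thresholding its projection, for instance at the midpoint between the projected class means; the ridge term mu plays the role of the regularizer needed because the empirical scatter matrix in feature space is rank-deficient.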
