Speed up kernel discriminant analysis

Linear discriminant analysis (LDA) is a popular method for dimensionality reduction that preserves class separability. The projection vectors are commonly obtained by maximizing the between-class covariance while simultaneously minimizing the within-class covariance. LDA can be performed either in the original input space or in the reproducing kernel Hilbert space (RKHS) into which the data points are mapped, the latter leading to kernel discriminant analysis (KDA). When the data are highly nonlinearly distributed, KDA can achieve better performance than LDA. However, computing the projective functions in KDA involves the eigen-decomposition of the kernel matrix, which is very expensive when the number of training samples is large. In this paper, we present a new algorithm for kernel discriminant analysis, called Spectral Regression Kernel Discriminant Analysis (SRKDA). Using spectral graph analysis, SRKDA casts discriminant analysis into a regression framework, which facilitates both efficient computation and the use of regularization techniques. Specifically, SRKDA only needs to solve a set of regularized regression problems, with no eigenvector computation involved, which saves a great deal of computational cost. The new formulation also makes it easy to develop an incremental version of the algorithm that fully reuses the computational results on the existing training samples, and to produce sparse projections (Sparse KDA) with an L1-norm regularizer. Extensive experiments on spoken letter, handwritten digit image, and face image data demonstrate the effectiveness and efficiency of the proposed algorithm.
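To make the regression reformulation concrete, the following is a minimal sketch of the idea the abstract describes: class labels are turned into a small set of orthonormal response vectors, and the kernel expansion coefficients are then found by solving regularized regression problems against the kernel matrix, with no eigen-decomposition. The function names, the default ridge parameter delta, and the QR-based construction of the responses are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def srkda_fit(K, labels, delta=0.01):
    """Sketch of SRKDA training via regularized regression.

    K      : (n, n) kernel (Gram) matrix of the training samples
    labels : (n,) integer class labels in {0, ..., c-1}
    delta  : ridge regularization parameter (illustrative default)
    """
    n = K.shape[0]
    classes = np.unique(labels)
    c = len(classes)

    # Build class-indicator vectors, prepend the all-ones vector,
    # and orthonormalize (QR plays the role of the Gram-Schmidt
    # step); dropping the trivial constant component leaves c - 1
    # response vectors for the regression problems.
    E = np.zeros((n, c))
    for j, cls in enumerate(classes):
        E[labels == cls, j] = 1.0
    T = np.hstack([np.ones((n, 1)), E])
    Q, _ = np.linalg.qr(T)
    Y = Q[:, 1:c]                    # (n, c-1) spectral targets

    # One Cholesky factorization of K + delta*I, reused for all
    # c - 1 right-hand sides -- no eigenvector computation needed.
    factor = cho_factor(K + delta * np.eye(n))
    A = cho_solve(factor, Y)         # expansion coefficients, (n, c-1)
    return A

def srkda_project(K_test_train, A):
    """Project test points given their kernel values against training data."""
    return K_test_train @ A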
