A Rayleigh-Ritz style method for large-scale discriminant analysis

Linear Discriminant Analysis (LDA) is one of the most popular approaches for supervised feature extraction and dimension reduction. However, computing LDA involves the eigendecomposition of dense matrices, which is time-consuming for large-scale problems. In this paper, we present a novel algorithm called Rayleigh-Ritz Discriminant Analysis (RRDA) for efficiently solving LDA. While much of the prior research focuses on transforming the generalized eigenvalue problem into a least squares formulation, our method is instead based on the well-established Rayleigh-Ritz framework for general eigenvalue problems and seeks to solve the generalized eigenvalue problem of LDA directly. By exploiting the structure of LDA problems, we are able to design customized and highly efficient subspace expansion and extraction strategies for the Rayleigh-Ritz procedure. To reduce the storage requirement and computational complexity of RRDA for high-dimensional, low-sample-size data, we also establish an equivalent reduced model of RRDA. Practical implementations and the convergence result of our method are also discussed. Experimental results on several real-world data sets demonstrate the performance of the proposed algorithm.
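To make the Rayleigh-Ritz idea concrete, the following is a minimal sketch of a single extraction step for an LDA-type generalized eigenvalue problem. It is not the RRDA algorithm itself: the abstract does not specify the subspace expansion strategy, so the search subspace `V` is simply supplied by the caller. The pencil (between-class scatter `Sb`, ridge-regularized total scatter `St + eps*I`), the helper names `scatter_matrices`, `rayleigh_ritz`, and `demo`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import eigh


def scatter_matrices(X, y):
    """Between-class (Sb) and total (St) scatter for samples stored as rows of X."""
    mean = X.mean(axis=0)
    Xc = X - mean
    St = Xc.T @ Xc
    Sb = np.zeros_like(St)
    for c in np.unique(y):
        Xk = X[y == c]
        d = (Xk.mean(axis=0) - mean)[:, None]
        Sb += Xk.shape[0] * (d @ d.T)
    return Sb, St


def rayleigh_ritz(A, B, V, k):
    """Extract the k largest Ritz pairs of the pencil (A, B) from span(V).

    A and B are symmetric, B positive definite. The columns of V span the
    search subspace; how V is built or expanded is left to the caller.
    """
    Q, _ = np.linalg.qr(V)          # orthonormal basis of the subspace
    A_r = Q.T @ A @ Q               # projected (small) pencil
    B_r = Q.T @ B @ Q
    w, Z = eigh(A_r, B_r)           # ascending Ritz values
    idx = np.argsort(w)[::-1][:k]   # keep the k largest
    return w[idx], Q @ Z[:, idx]    # Ritz values and Ritz vectors


def demo(seed=0):
    """Synthetic 3-class example; returns Ritz values, vectors, and residual norm."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(3), 20)
    centers = rng.normal(scale=3.0, size=(3, 5))      # hypothetical class means
    X = centers[y] + rng.normal(size=(60, 5))
    Sb, St = scatter_matrices(X, y)
    B = St + 1e-3 * np.eye(5)       # assumed ridge term keeps B positive definite
    V = np.eye(5)                   # full space, so Ritz pairs are exact eigenpairs
    theta, U = rayleigh_ritz(Sb, B, V, k=2)
    resid = np.linalg.norm(Sb @ U - (B @ U) * theta)
    return theta, U, resid
```

In the demo the subspace is the whole space, so the Ritz pairs coincide with the exact eigenpairs and the residual is at rounding level. In a large-scale setting `V` would have far fewer columns than the data dimension and would be grown iteratively, which is the part the paper customizes for LDA.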
