Training Linear Discriminant Analysis in Linear Time

Linear Discriminant Analysis (LDA) is a popular method for extracting features that preserve class separability. It is widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, computing LDA involves the eigen-decomposition of dense matrices, which can be expensive in both time and memory. Specifically, LDA has O(mnt + t^3) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, applying LDA is infeasible. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework, which facilitates both efficient computation and the use of regularization techniques. Our theoretical analysis shows that SRDA can be computed with O(ms) time and O(ms) memory, where s (s ≤ n) is the average number of non-zero features in each sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
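The regression view described above can be illustrated with a small sketch. The abstract does not spell out the algorithm, so everything below is an assumption about one plausible reading: class-indicator vectors are orthogonalized against the all-ones vector to give c - 1 response vectors, and each response is then fitted by ridge regression on centered data. The names `srda_responses`, `srda_fit`, and `alpha` are hypothetical, and the dense normal-equation solve is only a stand-in suited to toy inputs; it does not reproduce the O(ms) cost claimed for sparse data, which would need an iterative sparse least-squares solver instead.

```python
import numpy as np


def srda_responses(labels):
    """Return c-1 orthonormal response vectors that are constant within each class.

    Assumption: responses are obtained by orthogonalizing class indicators
    against the all-ones vector (done here with a QR factorization).
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    m = labels.shape[0]
    # One indicator column per class; the last one is dropped because the
    # indicators sum to the ones vector and would make the basis rank-deficient.
    indicators = (labels[:, None] == classes[None, :]).astype(float)
    basis = np.hstack([np.ones((m, 1)), indicators[:, :-1]])
    # Column 0 of q is proportional to the ones vector; the remaining
    # columns are orthonormal and orthogonal to it.
    q, _ = np.linalg.qr(basis)
    return q[:, 1:]


def srda_fit(X, labels, alpha=1.0):
    """Fit projection vectors by ridge regression onto the responses.

    alpha is a hypothetical regularization strength (not taken from the source).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # center the data
    Y = srda_responses(labels)
    n = X.shape[1]
    # Regularized normal equations: (Xc^T Xc + alpha I) A = Xc^T Y.
    # Dense solve for illustration only.
    A = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n), Xc.T @ Y)
    return mean, A


# Toy usage: three classes in five dimensions.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 5)) + 3.0 * labels[:, None]
mean, A = srda_fit(X, labels, alpha=0.1)
embedding = (X - mean) @ A  # each sample mapped to c-1 = 2 dimensions
print(embedding.shape)
```

In this sketch no eigen-decomposition of a dense scatter matrix is needed; the work reduces to c - 1 regularized least-squares problems, which is the property the abstract credits for the efficiency and for making regularization straightforward.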
