SRDA: An Efficient Algorithm for Large-Scale Discriminant Analysis

Linear Discriminant Analysis (LDA) has been a popular method for extracting features that preserve class separability. The projection functions of LDA are commonly obtained by maximizing the between-class covariance while simultaneously minimizing the within-class covariance. LDA has been widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, computing LDA involves the eigendecomposition of dense matrices, which can be expensive in both time and memory. Specifically, LDA has O(mnt + t^3) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, applying LDA is infeasible. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework that facilitates both efficient computation and the use of regularization techniques. Specifically, SRDA only needs to solve a set of regularized least squares problems and involves no eigenvector computation, which saves a substantial amount of both time and memory. Our theoretical analysis shows that SRDA can be computed with O(mn) time and O(ms) memory, where s (s ≤ n) is the average number of nonzero features in each sample. Extensive experimental results on four real-world data sets demonstrate the effectiveness and efficiency of our algorithm.
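To make the regression view concrete, the following is a minimal sketch of how discriminant directions might be obtained from regularized least squares alone, with no eigendecomposition. The abstract does not spell out how the regression targets are built; the sketch assumes they come from orthogonalizing class indicator vectors against the all-ones vector, and the names `srda_responses`, `srda_fit`, and `alpha` are illustrative rather than taken from the paper. A dense direct solve is used for brevity, so this sketch does not attain the complexity stated above; a sparse iterative least squares solver would be the natural substitute for large, sparse data.

```python
import numpy as np


def srda_responses(labels):
    """Build c-1 regression targets from class labels (assumed construction).

    Each target is constant within a class, has unit norm, and is
    orthogonal to the all-ones vector and to the other targets.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    m = labels.shape[0]
    # One 0/1 indicator column per class.
    indicators = (labels[:, None] == classes[None, :]).astype(float)
    # Put the all-ones vector first; drop the last indicator so the
    # columns are linearly independent (the indicators sum to ones).
    basis = np.hstack([np.ones((m, 1)), indicators[:, :-1]])
    # QR orthonormalizes the columns in order (Gram-Schmidt in effect).
    q, _ = np.linalg.qr(basis)
    # Discard the first column, which is proportional to the ones vector.
    return q[:, 1:]


def srda_fit(X, labels, alpha=1.0):
    """Solve one ridge regression per target; return projections and mean.

    alpha is a hypothetical name for the regularization strength.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # center the data
    Y = srda_responses(labels)
    # Normal equations of the ridge problem: (Xc^T Xc + alpha I) A = Xc^T Y.
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    A = np.linalg.solve(gram, Xc.T @ Y)
    return A, mean


# Usage on synthetic data: three classes, five features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(20, 5)) for c in (0.0, 3.0, 6.0)])
labels = np.repeat([0, 1, 2], 20)
A, mean = srda_fit(X, labels, alpha=0.1)
Z = (X - mean) @ A  # embedded samples, shape (60, 2)
```

With c classes the sketch yields c - 1 projection vectors, matching the usual dimensionality of an LDA embedding. The regularization term keeps the linear system well posed even when there are more features than samples.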
