Fast Fisher discriminant analysis with randomized algorithms

Abstract Fisher discriminant analysis (FDA) is a classical method for joint classification and dimension reduction. Regularized FDA (RFDA) and kernel FDA (KFDA) are two important variants. However, RFDA becomes computationally burdensome when the data are high-dimensional or the sample size is large, and KFDA incurs a similar burden due to its kernel operations. We propose fast FDA algorithms based on random projection and random feature maps to accelerate RFDA and KFDA, respectively. We give a theoretical guarantee that the fast FDA algorithms based on random projection achieve generalization ability comparable to that of conventional regularized FDA. We also give a theoretical guarantee that pseudoinverse FDA based on random feature maps shares similar generalization ability with conventional kernel FDA. Experimental results further validate the effectiveness of our methods.
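To make the two randomized ingredients concrete, here is a minimal NumPy sketch, not the authors' exact algorithm: the function names (rp_rfda, rff_map), the Gaussian sketching matrix, and the default values of sketch_dim, gamma, n_features, and sigma are all illustrative assumptions.

```python
import numpy as np

def rp_rfda(X, y, sketch_dim=64, gamma=1e-3, seed=0):
    """Regularized FDA fitted on a Gaussian random projection of X (n x d)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n, d = X.shape
    # Random projection: compress d features down to sketch_dim
    R = rng.standard_normal((d, sketch_dim)) / np.sqrt(sketch_dim)
    Z = X @ R
    classes = np.unique(y)
    mu = Z.mean(axis=0)
    Sw = gamma * np.eye(sketch_dim)          # within-class scatter + ridge term
    Sb = np.zeros((sketch_dim, sketch_dim))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - mu)[:, None]
        Sb += Zc.shape[0] * (diff @ diff.T)
    # Leading eigenvectors of (Sw + gamma*I)^{-1} Sb are the discriminant directions
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    top = np.argsort(-evals.real)[: len(classes) - 1]
    W = evecs[:, top].real
    return R, W  # embed new points as (X_new @ R) @ W
```

For the kernel variant, an explicit random feature map replaces the kernel matrix. A standard choice (again an assumption, not necessarily the exact map used in the paper) is the random Fourier feature map for the RBF kernel:

```python
def rff_map(X, n_features=256, sigma=1.0, seed=0):
    """Random Fourier features z(x) with E[z(x)^T z(x')] approximating the
    RBF kernel exp(-||x - x'||^2 / (2 * sigma**2))."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_features)) / sigma
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

Running a linear FDA solver such as the one above on Z = rff_map(X), with the regularized solve replaced by a pseudoinverse (np.linalg.pinv), plays the role of the pseudoinverse FDA that the abstract describes, operating on an n x n_features feature matrix instead of an n x n kernel matrix.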
