Regularized generalized eigen-decomposition with applications to sparse supervised feature extraction and sparse discriminant analysis

We propose a general technique for obtaining sparse solutions to generalized eigenvalue problems, which we call Regularized Generalized Eigen-Decomposition (RGED). Fisher's discriminant criterion, which has been applied for decades in supervised feature extraction and discriminant analysis, is formulated as a generalized eigenvalue problem. RGED can therefore be used to extract sparse features and compute sparse discriminant directions for all variants of models based on the Fisher discriminant criterion. In particular, RGED applies to matrix-based and even tensor-based discriminant techniques, for instance 2D-Linear Discriminant Analysis (2D-LDA). We also develop an iterative algorithm based on the alternating direction method of multipliers. The algorithm approximately solves RGED with monotonically decreasing convergence and reaches results of modest accuracy at an acceptable speed. Numerical experiments on four data sets covering different types of images show that RGED achieves classification performance competitive with existing multidimensional and sparse discriminant analysis techniques.

Highlights
• We propose a new technique called Regularized Generalized Eigen-Decomposition (RGED).
• RGED solves generalized eigenvalue problems and obtains sparse solutions.
• RGED applies straightforwardly to sparse discriminant analysis and feature extraction.
• An algorithm is developed to solve it with monotonically decreasing convergence.
• RGED has competitive classification performance compared with other methods.
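To make the generalized eigenvalue formulation of Fisher's criterion concrete, here is a minimal sketch of the dense, non-sparse baseline: it builds within-class and between-class scatter matrices and solves S_b w = lambda S_w w. This is not the RGED algorithm and involves no sparsity penalty or ADMM iteration. The function name `fisher_directions`, the `ridge` parameter, and the use of SciPy are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.linalg import eigh


def fisher_directions(X, y, n_components=1, ridge=1e-6):
    """Dense Fisher discriminant directions from S_b w = lambda * S_w w.

    Hypothetical helper for illustration only; it is the classical
    (non-sparse) baseline, not the RGED method described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        S_b += X_c.shape[0] * (diff @ diff.T)

    # Assumption: a small ridge keeps S_w positive definite so that the
    # symmetric-definite generalized eigensolver is applicable.
    S_w += ridge * np.eye(n_features)

    # eigh(a, b) solves a v = lambda b v and returns eigenvalues in
    # ascending order, so reorder to put the largest first.
    eigvals, eigvecs = eigh(S_b, S_w)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic classes separated along the first coordinate.
    X0 = rng.normal(size=(200, 3))
    X1 = rng.normal(size=(200, 3)) + np.array([5.0, 0.0, 0.0])
    X = np.vstack([X0, X1])
    y = np.array([0] * 200 + [1] * 200)
    values, directions = fisher_directions(X, y)
    print(values, directions[:, 0])
```

The directions returned by this baseline are generally dense, meaning every input variable receives a nonzero weight. Producing sparse directions from the same generalized eigenvalue problem is the gap the abstract says RGED addresses.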
