An optimization criterion for generalized discriminant analysis on undersampled problems

An optimization criterion is presented for discriminant analysis. The criterion extends the optimization criteria of classical Linear Discriminant Analysis (LDA) by using the pseudoinverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and the sample size, overcoming a limitation of classical LDA. The optimization problem can be solved analytically by applying the Generalized Singular Value Decomposition (GSVD). The pseudoinverse has previously been suggested and used for undersampled problems, in which the data dimension exceeds the number of data points; the criterion proposed in this paper provides a theoretical justification for that procedure. An approximation algorithm for the GSVD-based approach is also presented. It reduces the computational cost by finding subclusters of each cluster and using their centroids to capture the structure of each cluster. The reduced problem yields much smaller matrices, to which the GSVD can be applied efficiently. Experiments on text data with up to 7,000 dimensions show that the approximation algorithm produces results close to those of the exact algorithm.
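To illustrate the role of the pseudoinverse on an undersampled problem, the following is a minimal NumPy sketch, not the algorithm of the paper. It assumes a criterion of the form trace(S_t^+ S_b), where S_b and S_t are the between-class and total scatter matrices, and it solves it with a symmetric eigendecomposition rather than the GSVD. The function names `scatter_matrices` and `pinv_lda` and the tolerance `tol` are hypothetical choices made for this example.

```python
import numpy as np


def scatter_matrices(X, y):
    """Between-class (S_b) and total (S_t) scatter; rows of X are samples."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    Xm = X - mean
    S_t = Xm.T @ Xm
    return S_b, S_t


def pinv_lda(X, y, tol=1e-10):
    """Projection built from the leading eigenvectors of pinv(S_t) @ S_b.

    Assumption of this sketch: an eigendecomposition restricted to the
    range of S_t stands in for the GSVD-based solution.
    """
    S_b, S_t = scatter_matrices(X, y)
    n_components = len(np.unique(y)) - 1

    # S_t is symmetric positive semidefinite; keep only its nonzero spectrum.
    s, U = np.linalg.eigh(S_t)
    keep = s > tol * s.max()
    # W satisfies W @ W.T == pinv(S_t) up to the rank threshold.
    W = U[:, keep] / np.sqrt(s[keep])

    # Symmetric reduced problem; if v is an eigenvector of W.T @ S_b @ W,
    # then W @ v is an eigenvector of pinv(S_t) @ S_b.
    evals, V = np.linalg.eigh(W.T @ S_b @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return W @ V[:, order]


# Undersampled toy data: 12 samples in 50 dimensions, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 50))
y = np.repeat(np.arange(3), 4)
G = pinv_lda(X, y)   # 50 x 2 projection matrix
Z = X @ G            # reduced representation of the training data
```

Because S_t is singular here (its rank is at most 11 while the dimension is 50), a classical formulation that inverts a scatter matrix would not apply; restricting the problem to the range of S_t is what the pseudoinverse does implicitly.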
