A study of the generalized eigenvalue decomposition in discriminant analysis

The well-known Linear Discriminant Analysis (LDA) approach to feature extraction in classification problems is typically formulated using a generalized eigenvalue decomposition, $S_1 V = S_2 V \Lambda$, where $S_1$ and $S_2$ are two symmetric, positive-semidefinite matrices defining the measure to be maximized and the measure to be minimized, respectively. Most of the LDA algorithms developed to date are based on tuning one of these two matrices to solve a specific problem. However, the search for a set of metrics that can be applied to a large number of problems has met with difficulty. In this thesis, we take the view that most of these problems are caused by the use of the generalized eigenvalue decomposition equation described above. Further, we argue that many of these problems can be solved by studying and modifying this basic equation. At the core of this thesis lies a new factorization of $S_2^{-1} S_1$ that can be used to resolve several of the problems of LDA.

Three novel algorithms are derived, each based on our proposed factorization. In the first algorithm, we define a criterion to prune noisy bases in LDA. This is possible thanks to the flexibility of our factorization, which allows the suppression of a set of vectors of any metric. The second algorithm is called Subclass Discriminant Analysis (SDA). SDA can be applied to a large variety of distribution types because it approximates the underlying distribution of each class with a mixture of Gaussians. The most convenient number of Gaussians can be readily selected thanks to our proposed factorization. The third algorithm aims to address the over-fitting issue in LDA. A direct application of this algorithm is
