Optimal Subclass Discovery for Discriminant Analysis

Discriminant Analysis (DA) has had a profound influence on many scientific disciplines. Unfortunately, DA algorithms must make assumptions about the distribution of the available data and are therefore not universally applicable. For example, when the data of each class can be represented by a single Gaussian and all classes share a common covariance matrix, Linear Discriminant Analysis (LDA) is a good option. Under other distributions, other DA approaches may be preferable. Unfortunately, there still exist applications where no DA algorithm correctly represents reality, and unsupervised techniques, such as Principal Component Analysis (PCA), may then perform better. This paper first presents a theoretical study defining when and (most importantly) why DA techniques fail (Section 2). This study is then used to build a new DA algorithm that adapts to the available training data (Sections 2 and 3). The first main component of our solution is a method that automatically discovers the optimal set of subclasses within each class; we show that when this is achieved, optimal results can be obtained. The second main component, derived from our theoretical study, is a way to rapidly select the optimal number of subclasses. We present experimental results on two applications (object categorization and face recognition) and show that our method is always comparable or superior to LDA, Direct LDA (DLDA), Nonparametric DA (NDA), and PCA.
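
To make the subclass idea concrete, below is a minimal sketch in Python (assuming NumPy and scikit-learn). It is not the paper's algorithm: the subclass partitions here come from k-means rather than the paper's discovery method, and the number of subclasses is chosen by held-out accuracy rather than the paper's fast selection criterion. The helper names (subclass_labels, fit_subclass_lda, predict_classes) are hypothetical.

```python
# Sketch only: k-means subclass splits + LDA on subclass labels.
# This illustrates the general subclass-DA idea, not the paper's method.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split


def subclass_labels(X, y, n_subclasses):
    """Split every class into up to n_subclasses k-means clusters;
    return globally unique subclass labels."""
    labels = np.empty(len(y), dtype=int)
    offset = 0
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        k = min(n_subclasses, len(idx))  # no more subclasses than samples
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[idx])
        labels[idx] = km.labels_ + offset
        offset += k
    return labels


def fit_subclass_lda(X, y, n_subclasses):
    """Fit LDA on subclass labels; remember each subclass's parent class."""
    sub = subclass_labels(X, y, n_subclasses)
    lda = LinearDiscriminantAnalysis().fit(X, sub)
    sub_to_class = {s: y[sub == s][0] for s in np.unique(sub)}
    return lda, sub_to_class


def predict_classes(lda, sub_to_class, X):
    """Predict subclasses, then map each back to its parent class."""
    return np.array([sub_to_class[s] for s in lda.predict(X)])


# Toy usage: each class is a mixture of two clusters, so plain LDA
# (n_subclasses = 1) underfits and a larger subclass count helps.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=2,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
for h in range(1, 5):
    lda, s2c = fit_subclass_lda(Xtr, ytr, h)
    acc = np.mean(predict_classes(lda, s2c, Xte) == yte)
    print(f"subclasses per class = {h}: test accuracy = {acc:.3f}")
```

On mixture-of-Gaussians data like the toy example above, a subclass count matching the true number of clusters per class typically beats plain LDA (h = 1), which mirrors the paper's motivation for discovering subclasses automatically.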
