Linear dimensionality reduction via a heteroscedastic extension of LDA: the Chernoff criterion

We propose an eigenvector-based heteroscedastic linear dimension reduction (LDR) technique for multiclass data. The technique builds on a heteroscedastic two-class technique that uses the so-called Chernoff criterion, and it extends the well-known linear discriminant analysis (LDA). LDA, which is based on the Fisher criterion, cannot deal properly with heteroscedastic data. For the two-class case, the between-class scatter is generalized so as to capture differences in (co)variances. It is shown that the classical notion of between-class scatter can be associated with Euclidean distances between class means. From this viewpoint, the between-class scatter is generalized by employing the Chernoff distance measure, leading to our proposed heteroscedastic measure. Finally, using the results from the two-class case, a multiclass extension of the Chernoff criterion is proposed. This criterion combines the separation information present in the class means with that present in the class covariance matrices. Extensive experiments and a comparison with similar dimension reduction techniques are presented.
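To illustrate why a Chernoff-type measure can register covariance differences that a distance between class means alone cannot, the following is a minimal sketch of the textbook Chernoff distance between two Gaussian class models. It is not the criterion proposed in the paper, only the underlying distance measure; the function name `chernoff_distance`, the parameter `alpha`, and the toy inputs are assumptions made for this example.

```python
import numpy as np

def chernoff_distance(m1, S1, m2, S2, alpha=0.5):
    """Chernoff distance between N(m1, S1) and N(m2, S2), with 0 < alpha < 1.

    Illustrative helper (hypothetical name), using the standard closed form
    for Gaussians; alpha = 0.5 gives the Bhattacharyya distance.
    """
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    S1, S2 = np.asarray(S1, dtype=float), np.asarray(S2, dtype=float)
    d = m1 - m2
    S = alpha * S1 + (1.0 - alpha) * S2  # weighted average covariance

    # Term driven by the difference between the class means.
    mean_term = 0.5 * alpha * (1.0 - alpha) * (d @ np.linalg.solve(S, d))

    # Term driven by the difference between the class covariances.
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_1 = np.linalg.slogdet(S1)
    _, logdet_2 = np.linalg.slogdet(S2)
    cov_term = 0.5 * (logdet_S - alpha * logdet_1 - (1.0 - alpha) * logdet_2)

    return mean_term + cov_term

# Equal covariances: only the mean term contributes.
d_means = chernoff_distance([0, 0], np.eye(2), [2, 0], np.eye(2))

# Equal means, different covariances: the distance is still positive,
# whereas the Euclidean distance between the means is zero.
d_covs = chernoff_distance([0, 0], np.eye(2), [0, 0], 4 * np.eye(2))
```

In the second call the class means coincide, so a measure based only on distances between means sees no separation, while the covariance term remains strictly positive.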
