Multimodal oriented discriminant analysis

Linear discriminant analysis (LDA) has been an active research topic over the last century, but existing algorithms have several limitations when applied to visual data. LDA is optimal only for Gaussian-distributed classes with equal covariance matrices, and it can extract at most C−1 features, where C is the number of classes. Moreover, LDA does not scale well to high-dimensional data (it overfits), and it cannot optimally handle multimodal class distributions. In this paper, we introduce Multimodal Oriented Discriminant Analysis (MODA), an extension of LDA that overcomes these drawbacks. A new formulation and several novelties are proposed:

• An optimal dimensionality reduction for multimodal Gaussian classes with different covariances is derived. The new criterion allows more than C−1 features to be extracted.
• A covariance approximation is introduced to improve generalization and avoid overfitting with high-dimensional data.
• A linear-time iterative majorization method is proposed to find a local optimum.

Several synthetic and real experiments on face recognition show that MODA outperforms existing linear techniques.
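As a point of reference for the C−1 limitation discussed above, the sketch below implements standard Fisher LDA (not the MODA criterion proposed in the paper): the between-class scatter matrix is a sum of C rank-one terms constrained by the global mean, so its rank is at most C−1 and only that many discriminant directions have non-zero generalized eigenvalues. Function and variable names, and the small ridge added to keep the within-class scatter invertible, are illustrative assumptions, not part of the paper.

```python
# Minimal sketch of classical Fisher LDA, illustrating the C-1 feature limit.
import numpy as np

def fisher_lda(X, y, n_components):
    """X: (n_samples, n_features) data matrix, y: integer class labels."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_total = X.mean(axis=0)
    Sw = np.zeros((d, d))   # within-class scatter
    Sb = np.zeros((d, d))   # between-class scatter, rank <= len(classes) - 1
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_total)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem Sb w = lambda Sw w; the small ridge (assumed here)
    # keeps Sw invertible in the high-dimensional / small-sample regime that the
    # abstract identifies as a weakness of plain LDA.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(-evals.real)
    return evecs[:, order[:n_components]].real

# Usage: with 3 classes in 10 dimensions, only 2 directions carry discriminative
# information under the Fisher criterion, no matter how many are requested.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 10)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 50)
W = fisher_lda(X, y, n_components=2)
print(W.shape)  # (10, 2)
```

MODA replaces this single pooled within-class covariance with per-class, per-mode oriented covariances, which is what lifts the C−1 restriction described in the abstract.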
