Linear Feature Extraction with Emphasis on Face Recognition

Mohammad Shahin Mahanta
Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2009

Feature extraction is an important step in the classification of high-dimensional data such as face images. Linear feature extractors are especially prevalent because they are computationally efficient and preserve Gaussianity. This research proposes a simple and fast linear feature extractor that approximates the sufficient statistic for Gaussian distributions. The method preserves the discriminatory information in both the first and second moments of the data and yields linear discriminant analysis as a special case. Additionally, an accurate upper bound on the error probability of a plug-in classifier can be used to approximate the number of features that minimizes the error probability. Tighter error bounds are therefore derived in this work, based on either the Bayes error or the classification error on the trained distributions. These bounds can also serve as performance guarantees and can be used to determine the number of training samples required to guarantee approaching the performance of the Bayes classifier.
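As a point of reference for the special case mentioned in the abstract, the following is a minimal sketch of classical two-class Fisher linear discriminant analysis, which uses only first-moment (mean) differences under a pooled scatter matrix. It is not the extractor proposed in the thesis. The function names (`mean_vector`, `scatter_matrix`, `solve`, `fisher_direction`, `project`) and the toy data are assumptions made for illustration only.

```python
# Illustrative sketch of classical two-class Fisher LDA (not the thesis method).
# All names and the toy data below are hypothetical.

def mean_vector(rows):
    """Component-wise mean of a list of equal-length samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mean):
    """Sum of outer products of the mean-centered samples."""
    d = len(mean)
    S = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mean)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fisher_direction(class0, class1):
    """Fisher direction w = S_w^{-1} (m1 - m0); assumes S_w is nonsingular."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    S0, S1 = scatter_matrix(class0, m0), scatter_matrix(class1, m1)
    S_w = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(S0, S1)]
    return solve(S_w, [b - a for a, b in zip(m0, m1)])

def project(rows, w):
    """Map each sample to a single linear feature w^T x."""
    return [sum(x * wi for x, wi in zip(r, w)) for r in rows]

# Toy two-dimensional data (hypothetical).
class0 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (4.0, 5.0)]
class1 = [(6.0, 1.0), (7.0, 2.0), (8.0, 2.0), (9.0, 4.0)]
w = fisher_direction(class0, class1)
z0, z1 = project(class0, w), project(class1, w)
```

On this toy data the single extracted feature separates the two classes. With high-dimensional inputs such as face images the within-class scatter is typically singular, so a sketch like this would need regularization or a preliminary dimension reduction before the solve step.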
