Total variability modelling for face verification

This paper presents the first detailed study of total variability modelling (TVM) for face verification. TVM was originally proposed for speaker verification, where it has been accepted as state-of-the-art technology. Also referred to as front-end factor analysis, TVM uses a probabilistic model to represent a speech recording as a low-dimensional vector known as an 'i-vector'. This representation has been successfully applied to a wide variety of speech-related pattern recognition applications, and remains an active research topic in biometrics. In this work, the authors extend the application of i-vectors beyond the domain of speech to a novel representation of facial images for the purpose of face verification. Extensive experimentation on several challenging and publicly available face recognition databases demonstrates that TVM generalises well to this modality, providing between 17% and 39% relative reduction in verification error rate compared to a baseline Gaussian mixture model system. Several i-vector session compensation and scoring techniques were evaluated, including source-normalised linear discriminant analysis (SN-LDA), probabilistic LDA and within-class covariance normalisation. Finally, this study provides a detailed comparison of the complexity of TVM, highlighting some important computational advantages with respect to related state-of-the-art techniques.
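As a rough illustration of one of the session compensation techniques named above, the following is a minimal sketch of within-class covariance normalisation followed by cosine scoring, assuming i-vectors have already been extracted as fixed-length NumPy arrays. The function names, the regularisation constant and the choice of cosine scoring are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def wccn_projection(ivectors, labels, reg=1e-6):
    """Return B with B @ B.T == inv(W), W being the mean within-class covariance.

    `reg` is a hypothetical regularisation constant added for numerical stability.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    labels = np.asarray(labels)
    dim = ivectors.shape[1]
    classes = np.unique(labels)
    W = np.zeros((dim, dim))
    for c in classes:
        x = ivectors[labels == c]
        centred = x - x.mean(axis=0)
        W += centred.T @ centred / len(x)   # covariance of this identity
    W /= len(classes)                       # average over identities
    W += reg * np.eye(dim)                  # keep W invertible
    # Cholesky factor of inv(W) is lower triangular: B @ B.T == inv(W)
    return np.linalg.cholesky(np.linalg.inv(W))

def cosine_score(enrol, probe, B):
    """Cosine similarity between two i-vectors after projecting with B.T."""
    a = B.T @ np.asarray(enrol, dtype=float)
    b = B.T @ np.asarray(probe, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In this sketch, `wccn_projection` would be fitted on a labelled training set of i-vectors, and `cosine_score` would then compare an enrolment i-vector with a probe i-vector; a higher score suggests the same identity.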
