Temporal normalization of videos using visual speech

Pose and illumination variations have long been considered the major causes of poor results in automatic face recognition compared with other biometrics. The advent of video-based face recognition a decade ago brought new opportunities: algorithms were developed to exploit the abundance of data and the behavioral aspects of recognition. This modality also introduced new challenges, one of which is the variation introduced by speech. In this paper we present a novel method for handling this variation using temporal normalization based on lip motion. We evaluate the method by comparing face recognition results on the original, non-normalized videos with those on the normalized videos.
