Paper Information - Lip Biometrics for Digit Recognition

This paper presents a speaker-independent audio-visual digit recognition system that uses both speech and visual lip signals. The visual features are based on line-motion estimation obtained from low-resolution video sequences (128 × 128 pixels) and are used to increase the robustness of audio recognition. The core experiments investigate lip-motion biometrics both as a stand-alone modality and merged with audio in a speech recognition system. The system uses Support Vector Machines and shows favourable experimental results, with digit recognition ranging from 83% to 100% on the XM2VTS database, depending on the amount of available visual information.

Josef Bigün | Maycel Isaac Faraj
