A speaker independent "liveness" test for audio-visual biometrics

In biometrics, it is crucial to detect impostors and thwart replay attacks. However, little research has so far focused on “liveness” verification. This test ensures that the biometric cues being acquired are actual measurements from a live person who is present at the time of capture. Here, we propose a speaker-independent “liveness” verification method for audio-video identification systems. It exploits the correlation that exists between lip movements and the speech produced. Two data analysis methods are considered to model this statistical link. Finally, in tests carried out on the XM2VTS database, the best liveness verification equal error rate (EER) achieved is 14.5%.
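To make the idea of scoring audio-visual correlation concrete, the following is a minimal sketch and not the authors' implementation. It assumes canonical correlation analysis as the statistical model (the abstract does not name the two methods it compares), and it assumes per-frame audio features `A` and lip-shape features `V` have already been extracted and time-aligned. The function names, the regularisation constant `reg`, and the feature layout are all hypothetical.

```python
import numpy as np


def fit_first_canonical_pair(A, V, reg=1e-6):
    """Learn one pair of projection vectors from training data.

    A: (n_frames, d_audio) audio features, V: (n_frames, d_video) lip features.
    Both are assumed to be frame-synchronised recordings of live speakers.
    """
    mean_a, mean_v = A.mean(axis=0), V.mean(axis=0)
    Ac, Vc = A - mean_a, V - mean_v
    n = A.shape[0]
    # Regularised covariance and cross-covariance estimates.
    Caa = Ac.T @ Ac / (n - 1) + reg * np.eye(A.shape[1])
    Cvv = Vc.T @ Vc / (n - 1) + reg * np.eye(V.shape[1])
    Cav = Ac.T @ Vc / (n - 1)
    # Whiten both sides with Cholesky factors, then take the SVD of the
    # whitened cross-covariance M = La^{-1} Cav Lv^{-T}.
    La = np.linalg.cholesky(Caa)
    Lv = np.linalg.cholesky(Cvv)
    M = np.linalg.solve(La, Cav)
    M = np.linalg.solve(Lv, M.T).T
    U, _, Vt = np.linalg.svd(M)
    # Map the leading singular vectors back to the original feature spaces.
    w_a = np.linalg.solve(La.T, U[:, 0])
    w_v = np.linalg.solve(Lv.T, Vt[0])
    return mean_a, mean_v, w_a, w_v


def liveness_score(A, V, model):
    """Absolute correlation between projected audio and projected video."""
    mean_a, mean_v, w_a, w_v = model
    proj_a = (A - mean_a) @ w_a
    proj_v = (V - mean_v) @ w_v
    return float(abs(np.corrcoef(proj_a, proj_v)[0, 1]))
```

Because the projection vectors are learned once from pooled training data and then applied to any test sequence, a sketch of this kind does not need a per-speaker model. A sequence whose score falls below a threshold chosen on development data would be rejected as not live; the EER is the operating point at which the false acceptance and false rejection rates are equal.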
