Audio-Visual Speech Synchrony Measure for Talking-Face Identity Verification

We investigate the use of audio-visual speech synchrony measures in the framework of identity verification based on talking faces. Two synchrony measures, based on canonical correlation analysis and co-inertia analysis respectively, are introduced, and their performance is evaluated on the specific task of distinguishing synchronized from non-synchronized audio-visual speech sequences. The notion of high-effort impostor attacks is also introduced as a serious threat to current biometric systems based on speaker verification and face recognition. A novel biometric modality based on synchrony measures is introduced in order to improve the overall performance of identity verification, and more specifically its robustness to replay attacks.
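
As a rough illustration of what a synchrony measure based on canonical correlation analysis can look like, the sketch below scores a pair of feature sequences by their largest canonical correlation. It is a minimal example under stated assumptions, not the measure evaluated in this work: the function name, the feature dimensions, and the synthetic data are all hypothetical, and the projections are estimated on the test sequence itself rather than learned beforehand.

```python
import numpy as np


def first_canonical_correlation(audio, video):
    """Largest canonical correlation between two feature sequences.

    Both inputs are arrays of shape (frames, features) with the same number
    of frames. Assumes more frames than features and full column rank.
    """
    # Center each feature stream so that correlations are measured around the mean.
    a = audio - audio.mean(axis=0)
    v = video - video.mean(axis=0)
    # Orthonormal bases of the two centered feature spaces (reduced QR).
    qa, _ = np.linalg.qr(a)
    qv, _ = np.linalg.qr(v)
    # The singular values of qa.T @ qv are the canonical correlations.
    s = np.linalg.svd(qa.T @ qv, compute_uv=False)
    # Clip tiny numerical overshoot above 1.
    return float(min(s[0], 1.0))


# Synthetic stand-ins for real features (hypothetical sizes: 4 audio, 3 visual).
rng = np.random.default_rng(0)
n_frames = 500
audio = rng.standard_normal((n_frames, 4))
mixing = rng.standard_normal((4, 3))

# "Synchronized" video: a noisy linear function of the audio at the same frame.
video_sync = audio @ mixing + 0.1 * rng.standard_normal((n_frames, 3))
# "Not synchronized" video: the same stream shifted by 50 frames.
video_unsync = np.roll(video_sync, 50, axis=0)

score_sync = first_canonical_correlation(audio, video_sync)
score_unsync = first_canonical_correlation(audio, video_unsync)
print(f"synchronized: {score_sync:.3f}  shifted: {score_unsync:.3f}")
```

On this toy data the synchronized pair scores much higher than the shifted pair, which is the behaviour a detector of non-synchronized sequences relies on. Real speech features are correlated over time, so the gap on real recordings would depend on the features and the size of the shift.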
