Paper information - Audio-Visual Speech Synchrony Measure for Talking-Face Identity Verification

Audio-Visual Speech Synchrony Measure for Talking-Face Identity Verification

We investigate the use of audio-visual speech synchrony measures in the framework of identity verification based on talking faces. Two synchrony measures, based on canonical correlation analysis and co-inertia analysis respectively, are introduced, and their performance is evaluated on the specific task of distinguishing synchronized from non-synchronized audio-visual speech sequences. The notion of high-effort impostor attacks is also introduced as a dangerous threat to current biometric systems based on speaker verification and face recognition. A novel biometric modality based on synchrony measures is introduced in order to improve the overall performance of identity verification, and more specifically its robustness to replay attacks.
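
As a rough illustration of the two families of measures named in the abstract, the sketch below scores a pair of feature sequences with the first canonical correlation (canonical correlation analysis) and with the largest singular value of the cross-covariance matrix (the quantity maximized by co-inertia analysis). It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the regularization constant `reg`, the feature dimensions, and the synthetic "audio" and "video" features are all hypothetical, and both scores are computed directly on the sequence under test rather than with projections learned beforehand.

```python
import numpy as np


def _center(features):
    """Subtract the per-dimension mean from a (frames x dims) feature matrix."""
    return features - features.mean(axis=0)


def cca_synchrony(audio, video, reg=1e-6):
    """First canonical correlation between two (frames x dims) sequences.

    `reg` is a small ridge term (an assumption of this sketch) that keeps
    the within-modality covariance matrices invertible.
    """
    a, v = _center(audio), _center(video)
    n = a.shape[0]
    c_aa = a.T @ a / (n - 1) + reg * np.eye(a.shape[1])
    c_vv = v.T @ v / (n - 1) + reg * np.eye(v.shape[1])
    c_av = a.T @ v / (n - 1)
    # Whiten both modalities with Cholesky factors: K = La^{-1} C_av Lv^{-T}.
    l_a = np.linalg.cholesky(c_aa)
    l_v = np.linalg.cholesky(c_vv)
    k = np.linalg.solve(l_a, c_av)
    k = np.linalg.solve(l_v, k.T).T
    # The singular values of K are the canonical correlations.
    singular_values = np.linalg.svd(k, compute_uv=False)
    return float(min(singular_values[0], 1.0))


def coia_synchrony(audio, video):
    """Largest covariance reachable by unit-norm projections of each modality.

    This equals the largest singular value of the cross-covariance matrix.
    """
    a, v = _center(audio), _center(video)
    c_av = a.T @ v / (a.shape[0] - 1)
    return float(np.linalg.svd(c_av, compute_uv=False)[0])


# Synthetic stand-ins for acoustic and visual speech features (hypothetical).
rng = np.random.default_rng(0)
n_frames = 500
audio_feats = rng.standard_normal((n_frames, 4))
mixing = rng.standard_normal((4, 3))
video_feats = audio_feats @ mixing + 0.1 * rng.standard_normal((n_frames, 3))

# Shifting the video stream by 50 frames breaks frame-level synchrony.
shifted_video = np.roll(video_feats, 50, axis=0)

sync_cca = cca_synchrony(audio_feats, video_feats)
shifted_cca = cca_synchrony(audio_feats, shifted_video)
sync_coia = coia_synchrony(audio_feats, video_feats)
shifted_coia = coia_synchrony(audio_feats, shifted_video)

print("CCA  synchronized vs shifted:", sync_cca, shifted_cca)
print("CoIA synchronized vs shifted:", sync_coia, shifted_coia)
```

On this synthetic data the synchronized pair should score clearly higher than the shifted pair under both measures, which is the property a synchrony-based check against replay attacks would rely on.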

Gérard Chollet | Hervé Bredin
