Investigating feature-level fusion for checking liveness in face-voice authentication

In this paper we propose a feature-level fusion approach for liveness verification in face-voice person authentication. Liveness verification experiments conducted on two audiovisual databases, VidTIMIT and UCBN, show that feature-level fusion is a powerful technique for protecting systems that are vulnerable to replay attacks, as it preserves the synchronisation between closely coupled modalities, such as voice and face, through the various stages of authentication. In replay-attack experiments, fusing acoustic feature vectors with visual feature vectors from the lip region at the feature level reduces the error rate by 25-40% compared with the classical late-fusion approach.
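The core idea of feature-level fusion is that acoustic and lip-region visual feature vectors are joined into a single vector per frame before classification, so the temporal coupling between the two streams is never discarded. A minimal sketch of this step is below; the function name, feature dimensions, and the use of plain concatenation are illustrative assumptions, not the paper's exact pipeline, and it assumes both streams have already been resampled to a common frame rate.

```python
import numpy as np

def fuse_features(acoustic: np.ndarray, visual: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate frame-synchronous acoustic and
    visual (lip-region) feature vectors into one joint vector per frame.

    Both arrays must have shape (n_frames, dim); preserving the frame
    alignment is what lets a downstream classifier exploit audio-visual
    synchrony to detect replay attacks.
    """
    if acoustic.shape[0] != visual.shape[0]:
        raise ValueError("streams must be frame-synchronous")
    return np.hstack([acoustic, visual])

# Toy example: 100 frames of 13-dim acoustic features (e.g. MFCCs)
# and 20-dim lip-region features (dimensions are hypothetical).
rng = np.random.default_rng(0)
acoustic = rng.standard_normal((100, 13))
visual = rng.standard_normal((100, 20))
joint = fuse_features(acoustic, visual)
print(joint.shape)  # (100, 33)
```

By contrast, a late-fusion system would score each stream with its own classifier and combine only the scores, at which point frame-level synchrony between voice and lip motion is lost.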

[1]  Arun Ross,et al.  Information fusion in biometrics , 2003, Pattern Recognit. Lett..

[2]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[3]  Sun-Yuan Kung,et al.  Multi-sample fusion with constrained feature transformation for robust speaker verification , 2004, INTERSPEECH.

[4]  Jiri Matas,et al.  Combining evidence in personal identity verification systems , 1997, Pattern Recognit. Lett..

[5]  Kuldip K. Paliwal,et al.  Fast features for face authentication under illumination direction changes , 2003, Pattern Recognit. Lett..

[6]  Roland Auckenthaler,et al.  Improving a GMM speaker verification system by phonetic weighting , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).