Can a Professional Imitator Fool a GMM-Based Speaker Verification System?

This paper presents an attempt to assess empirically how a state-of-the-art text-independent speaker verification system behaves when confronted with impostor attempts by a professional imitator who knows very well how to imitate, in particular, the clients he tries to impersonate. The empirical evidence shows that, fortunately, current speaker verification systems are robust to such attempts, even when human listeners cannot discriminate between true and impostor accesses (a website with some examples is provided to convince the reader). Furthermore, we show that knowing the lexical content of the access significantly helps the imitator, although fortunately not enough to fool the system. This study thus represents a first step in assessing a speaker verification system against true, informed impostors.
