Influence of Natural Voice Disguise Techniques on Automatic Speaker Recognition

The problem of voice disguise is usually investigated in the context of surveillance and forensics. The biometric techniques applied in automatic or subjective speaker recognition can be misled, deliberately or non-deliberately, by technical or natural methods. This paper presents the collection of a disguised-voice database and the automatic speaker recognition scores obtained on it. The database contains utterances produced with several natural voice disguise techniques: phonation (raised and lowered pitch, whisper), phonemic (foreign accent), prosodic (speech tempo) and deformation (pinched nostrils and clenched jaws). Speaker verification was carried out with a state-of-the-art system combining MFCC (Mel Frequency Cepstral Coefficients) feature extraction with GMM (Gaussian Mixture Models) classification.
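To make the verification step concrete, the following is a minimal sketch of one common way to score a GMM-based verifier: the average per-frame log-likelihood ratio between a speaker model and a background model. It is not the system used in the paper, whose configuration the abstract does not specify. The use of a background model, the diagonal covariances, the zero threshold, and all numeric values are assumptions made for illustration. MFCC extraction is omitted, and the two-dimensional "frames" are hypothetical stand-ins for real MFCC vectors.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm) for a mixture given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def verification_score(frames, speaker_gmm, background_gmm):
    """Average per-frame log-likelihood ratio: speaker model vs. background model."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, background_gmm)
        for x in frames
    )
    return total / len(frames)

# Hypothetical hand-set models (assumption: a trained system would fit these with EM).
speaker_gmm = [(0.5, [1.0, 1.0], [0.5, 0.5]), (0.5, [2.0, 0.0], [0.5, 0.5])]
background_gmm = [(0.5, [0.0, 0.0], [2.0, 2.0]), (0.5, [1.0, -1.0], [2.0, 2.0])]

# Hypothetical feature frames: one set near the speaker model, one shifted away
# from it, as a disguise such as altered pitch might shift the features.
normal_frames = [[1.1, 0.9], [1.9, 0.1], [1.0, 1.2]]
disguised_frames = [[-1.0, -1.5], [-0.5, -2.0], [0.0, -1.0]]

THRESHOLD = 0.0  # assumed decision threshold; normally tuned on development data
for name, frames in [("normal", normal_frames), ("disguised", disguised_frames)]:
    score = verification_score(frames, speaker_gmm, background_gmm)
    print(name, "accept" if score > THRESHOLD else "reject")
```

In this toy setup the frames that sit near the speaker model score above the threshold, while the shifted frames score below it. That is the kind of effect a disguise can have on a verifier: the true speaker is rejected because the features no longer match the enrolled model.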
