Speaker Verification in Noisy Environments Using Gaussian Mixture Models

This chapter examines the behavior of Gaussian Mixture Models (GMMs) for speaker verification in noisy environments. Specifically, it studies the performance of the GMM-UBM acoustic modeling framework, which combines speaker-dependent GMMs with a speaker-independent Universal Background Model (UBM), under simulated noisy backgrounds. It also explores the significance of a feature mapping technique that uses multiple UBMs to compensate for background noise. The speaker verification systems described in this chapter serve as baselines for comparing and analyzing the performance improvements of the methods proposed in the remaining chapters.
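
For orientation, a GMM-UBM system conventionally scores a test utterance by the average log-likelihood ratio between the claimed speaker's GMM and the UBM. The expression below is a sketch of that standard score, not a formula taken from this chapter; the symbols (X for the sequence of T feature vectors, \lambda_{spk} and \lambda_{UBM} for the two models, \theta for the decision threshold) are notation assumed here for illustration.

```latex
% Average log-likelihood ratio for a test utterance X = {x_1, ..., x_T}
% (assumed notation: \lambda_{spk} = claimed speaker's GMM, \lambda_{UBM} = background model)
\Lambda(X) = \frac{1}{T} \sum_{t=1}^{T}
  \left[ \log p(x_t \mid \lambda_{spk}) - \log p(x_t \mid \lambda_{UBM}) \right]

% Each model is a mixture of M Gaussians with weights w_i, means \mu_i, covariances \Sigma_i
p(x_t \mid \lambda) = \sum_{i=1}^{M} w_i \, \mathcal{N}(x_t;\, \mu_i, \Sigma_i),
  \qquad \sum_{i=1}^{M} w_i = 1

% Decision rule: accept the identity claim when the score reaches the threshold
\Lambda(X) \ge \theta \;\Rightarrow\; \text{accept}, \qquad
\Lambda(X) < \theta \;\Rightarrow\; \text{reject}
```

Under this scoring, background noise that shifts the test features away from the conditions seen in training changes both likelihood terms, which is the mismatch that noise compensation techniques such as feature mapping aim to reduce.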
