A fast approach to psychoacoustic model compensation for robust speaker recognition in additive noise

This paper addresses the problem of speaker verification in the presence of additive noise. We propose a fast implementation of the Psychoacoustic Model Compensation (Psy-Comp) scheme for static features, along with model-domain mean and variance normalization, for robust speaker recognition in noisy conditions. The proposed algorithms are validated through experiments on the noise-corrupted NIST-2000 speaker recognition database. We show that the Psy-Comp scheme combined with model-domain mean and variance normalization provides a significant performance gain over both the Vector Taylor Series (VTS) scheme and the feature-domain cepstral mean and variance normalization scheme. Moreover, the computational cost of the proposed method is significantly lower than that of the VTS scheme.
