Robust speaker verification with a two classifier format and feature enhancement

Environmental noise degrades the performance of speaker verification systems. This paper proposes (1) the use of two parallel classifiers, (2) feature enhancement based on blind signal-to-noise ratio (SNR) estimation and (3) fusion of the classifier outputs to improve the performance of speaker verification systems. The two classifiers are based on Gaussian mixture models and the partial least-squares technique. Speech corrupted by additive noise at SNRs from 0 to 30 dB is used for authentication. A two-way analysis of variance validates the performance gain offered by the methods used. The outputs of the classifiers are fused in several different ways. Statistical analysis again shows that the best fusion method is the one in which the scores of the two classifiers are added together.
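As an illustration of the additive score fusion mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes each classifier returns one score per verification trial. The z-score normalization step, the function names (`znorm`, `sum_fusion`, `verify`) and the decision threshold are assumptions introduced here so that scores on different scales can be added; the abstract only states that the scores are added together.

```python
import numpy as np


def znorm(scores):
    """Scale scores to zero mean and unit variance.

    This normalization is an assumption of the sketch, not a step
    taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    centered = scores - scores.mean()
    std = scores.std()
    # Guard against a constant score vector (zero standard deviation).
    return centered / std if std > 0 else centered


def sum_fusion(gmm_scores, pls_scores):
    """Fuse two classifiers by adding their normalized per-trial scores."""
    return znorm(gmm_scores) + znorm(pls_scores)


def verify(fused_scores, threshold=0.0):
    """Accept a trial when its fused score exceeds a hypothetical threshold."""
    return np.asarray(fused_scores) > threshold


# Made-up scores for four trials, for illustration only.
gmm = [2.0, -1.0, 0.5, -1.5]
pls = [0.8, 0.1, 0.6, 0.2]
fused = sum_fusion(gmm, pls)
decisions = verify(fused)
```

In practice the threshold would be tuned on held-out data, for example at the equal error rate operating point.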
