Speaker verification in noise using a stochastic version of the weighted Viterbi algorithm

This paper proposes replacing the ordinary output probability with its expected value when the addition of noise is modeled as a stochastic process, which is in turn merged with the hidden Markov model (HMM) in the Viterbi algorithm. The new output probability is derived analytically for the general case of a mixture of Gaussians and can be seen as defining a stochastic version of the weighted Viterbi algorithm. An analytical expression for estimating the uncertainty in noise canceling is also presented. The method is combined with spectral subtraction to improve the robustness of a text-dependent speaker verification system to additive noise. Reductions as high as 30% or 40% in the error rates and improvements of 50% in the stability of the decision thresholds are reported.
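
As a rough illustration of the idea of an expected output probability, the sketch below assumes that each noise-compensated feature is a Gaussian random variable with a known mean and uncertainty variance, and that the HMM state uses a diagonal-covariance Gaussian mixture. Under those assumptions (which are ours, not taken from the paper's derivation), the expectation of each Gaussian component has a closed form: the uncertainty variance is simply added to the model variance. The function and parameter names are hypothetical.

```python
import math


def gauss(x, mean, var):
    """Univariate Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def expected_output_prob(obs_mean, obs_var, weights, means, variances):
    """Expected value of a diagonal-covariance GMM output probability.

    Assumption: feature d of the observation is distributed as
    N(obs_mean[d], obs_var[d]), independently across dimensions.
    Then E[N(O_d; mu, var)] = N(obs_mean[d]; mu, var + obs_var[d]).
    """
    total = 0.0
    for w, comp_means, comp_vars in zip(weights, means, variances):
        comp = 1.0
        for m, u, mu, var in zip(obs_mean, obs_var, comp_means, comp_vars):
            # Uncertainty variance u widens the component in this dimension.
            comp *= gauss(m, mu, var + u)
        total += w * comp
    return total


# Hypothetical one-dimensional, two-component state model.
weights = [0.6, 0.4]
means = [[0.0], [2.0]]
variances = [[1.0], [0.5]]

reliable = expected_output_prob([1.0], [0.0], weights, means, variances)
uncertain = expected_output_prob([1.0], [0.3], weights, means, variances)
print(reliable, uncertain)
```

With zero uncertainty variance the function reduces to the ordinary output probability. As the uncertainty grows, the likelihood becomes flatter across states, so unreliable frames influence the Viterbi path less, which is one way to read the "weighted" behaviour in this simplified setting.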
