Model-based approaches to adaptive training in reverberant environments

Adaptive training is a powerful approach for building speech recognition systems from non-homogeneous data. This work extends model-based adaptive training to handle reverberant environments. The recently proposed Reverberant VTS-Joint (RVTSJ) adaptation[1] is used to factor out unwanted additive and reverberant noise variations in multi-condition training data, yielding a canonical model that is neutral to noise conditions. A maximum likelihood estimation of the canonical model parameters is described. An initialisation scheme that uses VTS-based adaptive training to initialise the model parameters is also presented. Experiments are conducted on a reverberant simulated AURORA4 task.

[1]  Yifan Gong, et al. High-performance HMM adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series, 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).

[2]  Y.-Q. Wang, et al. Model-based approaches to handling additive noise in reverberant environments, 2011, 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays.

[3]  Alex Acero, et al. Noise adaptive training using a vector Taylor series approach for noise robust automatic speech recognition, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4]  Hank Liao, et al. Joint uncertainty decoding for robust large vocabulary speech recognition, 2006.

[5]  Hans-Günter Hirsch, et al. The simulation of realistic acoustic input scenarios for speech recognition systems, 2005, INTERSPEECH.

[6]  Maurizio Omologo, et al. Hidden Markov model training with contaminated speech material for distant-talking speech recognition, 2002, Comput. Speech Lang.

[7]  Richard M. Schwartz, et al. A compact model for speaker-adaptive training, 1996, Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP '96).

[8]  Roland Maas, et al. Multi-style training of HMMs with stereo data for reverberation-robust speech recognition, 2011, 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays.

[9]  Yongqiang Wang, et al. Improving reverberant VTS for hands-free robust speech recognition, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.