Stereo-based stochastic noise compensation based on trajectory GMMs

This paper proposes a novel stereo-based stochastic noise compensation technique based on trajectory GMMs. Although GMM-based noise compensation techniques such as SPLICE work effectively, their performance sometimes degrades because the frame-by-frame mapping yields inappropriate dynamic characteristics. Imposing dynamic feature constraints at the mapping stage can alleviate this problem, but it introduces an inconsistency between training and mapping. The recently proposed trajectory GMM-based feature mapping technique resolves this inconsistency while retaining the benefits of dynamic features, and it transforms the entire sequence at once rather than frame by frame. Results from a noise compensation experiment on the AURORA-2 task show that the proposed trajectory GMM-based noise compensation technique outperforms the conventional ones.
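
To make the contrast between frame-by-frame and sequence-level mapping concrete, the following is a minimal sketch of how dynamic feature constraints can tie neighbouring frames together. It is not the paper's implementation. It assumes a one-dimensional feature, a single Gaussian per frame with diagonal covariance over static and delta components, and a delta window of 0.5 * (c[t+1] - c[t-1]). The function names `build_window_matrix` and `smooth_trajectory` are hypothetical. Given per-frame means and variances, the static trajectory c is obtained by solving (WᵀPW) c = WᵀPμ, where W stacks the static and delta windows and P holds the precisions.

```python
import numpy as np


def build_window_matrix(num_frames):
    """Return W of shape (2T, T) mapping a static sequence to
    interleaved [static_t, delta_t] observations.

    Assumed delta window: 0.5 * (c[t+1] - c[t-1]), with indices
    clamped at the sequence edges.
    """
    T = num_frames
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                         # static row
        W[2 * t + 1, min(t + 1, T - 1)] += 0.5    # delta row, next frame
        W[2 * t + 1, max(t - 1, 0)] -= 0.5        # delta row, previous frame
    return W


def smooth_trajectory(means, variances):
    """Solve for the static trajectory under static and delta targets.

    means, variances: arrays of shape (T, 2) holding, per frame,
    the [static, delta] mean and variance (illustrative inputs).
    Returns an array of shape (T,).
    """
    T = means.shape[0]
    W = build_window_matrix(T)
    mu = means.reshape(-1)                 # interleaved targets
    prec = 1.0 / variances.reshape(-1)     # diagonal precision P
    A = W.T @ (prec[:, None] * W)          # W^T P W
    b = W.T @ (prec * mu)                  # W^T P mu
    return np.linalg.solve(A, b)


# Illustrative input: a jittery static target with delta targets of zero.
static = np.array([0.0, 1.0, 0.0, 1.0, 0.0])
means = np.stack([static, np.zeros(5)], axis=1)
variances = np.stack([np.ones(5), np.full(5, 1e-3)], axis=1)
trajectory = smooth_trajectory(means, variances)
```

Because every frame's delta row involves its neighbours, the solution is coupled across the whole sequence. A purely frame-by-frame mapping would simply return the static means and ignore that coupling.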
