Linear Dynamic Models With Mixture of Experts Architecture for Recognition of Speech Under Additive Noise Conditions

This letter presents a new approach to enhancing speech feature estimation in the log-spectral domain in noisy environments. A mixture of linear dynamic models, with an architecture similar to the mixture of experts (ME), is investigated as a parametric description of the clean speech feature distribution. Switching Kalman filters are adapted to the proposed model and estimate the clean speech components by means of a generalized pseudo-Bayesian (GPB) algorithm. Experimental results suggest that, compared with previous methods, the proposed approach compensates noisy speech features more effectively for robust speech recognition.
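
To make the estimation scheme described above concrete, the following is a minimal sketch of one recursion of a switching Kalman filter with first-order GPB collapsing. It is not the letter's implementation, and several points are assumptions made for illustration: the observation model is taken to be linear-Gaussian (y = x + v), whereas noisy speech in the log-spectral domain is related to clean speech nonlinearly; the gate is a softmax over a linear function of the previous state estimate, in the spirit of a mixture of experts; the GPB order is fixed at 1; and the names `gpb1_step`, `experts`, `gate_W`, `gate_c`, and `R` are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def gauss_logpdf(y, mean, cov):
    """Log density of a multivariate Gaussian evaluated at y."""
    d = y - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(cov, d))

def gpb1_step(m, P, y, experts, gate_W, gate_c, R):
    """One GPB(1) recursion for a mixture of linear dynamic models.

    m, P     : collapsed Gaussian estimate of the previous clean feature
    y        : current noisy observation (assumed here: y = x + v, v ~ N(0, R))
    experts  : list of (A, b, Q), one linear dynamic model per expert
    gate_W/c : hypothetical softmax gate on the previous estimate
    """
    K = len(experts)
    prior = softmax(gate_W @ m + gate_c)       # gate probabilities
    means, covs, loglik = [], [], np.empty(K)
    for k, (A, b, Q) in enumerate(experts):
        # Kalman prediction under expert k
        m_pred = A @ m + b
        P_pred = A @ P @ A.T + Q
        # Kalman update with identity observation matrix
        S = P_pred + R                         # innovation covariance
        G = P_pred @ np.linalg.inv(S)          # Kalman gain
        means.append(m_pred + G @ (y - m_pred))
        covs.append(P_pred - G @ P_pred)
        loglik[k] = gauss_logpdf(y, m_pred, S)
    # Posterior expert weights: gate prior times predictive likelihood
    w = softmax(np.log(prior + 1e-300) + loglik)
    # Collapse the K posteriors into one Gaussian by moment matching
    m_new = sum(wk * mk for wk, mk in zip(w, means))
    P_new = sum(wk * (Pk + np.outer(mk - m_new, mk - m_new))
                for wk, mk, Pk in zip(w, means, covs))
    return m_new, P_new, w
```

Calling `gpb1_step` frame by frame, feeding each returned `(m_new, P_new)` back in as the next `(m, P)`, yields a running clean-feature estimate; with a single expert the step reduces to an ordinary Kalman filter update.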
