Speech enhancement based on extended Kalman filter and neural predictive hidden Markov model

To represent the nonlinear and nonstationary nature of speech, we assume that speech is the output of a neural predictive hidden Markov model (NPHMM), which combines a neural network with a hidden Markov model (HMM). The NPHMM is a nonlinear autoregressive process whose time-varying parameters are controlled by a hidden Markov chain. Given speech data for training, the parameters of the NPHMM are estimated by a learning algorithm that combines the Baum-Welch algorithm with neural network training by backpropagation. A recursive method using an extended Kalman filter with the parameters of a trained NPHMM is developed for enhancing speech signals degraded by statistically independent additive noise, assumed to be white and Gaussian. Our recursive speech enhancement method achieves an SNR improvement of about 1 to 1.5 dB over the hidden filter model method.
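
As a rough illustration of the recursive step, the sketch below runs an extended Kalman filter over a noisy signal using a single nonlinear autoregressive predictor. It is not the authors' implementation and makes several simplifying assumptions: only one HMM state is used (no switching between predictors), the predictor is a one-hidden-layer tanh network with untrained random weights, and the names `mlp_predict` and `ekf_enhance`, the order `p`, and the variances `q` and `r` are hypothetical choices made for the example.

```python
import numpy as np


def mlp_predict(x, W1, b1, w2, b2):
    """One-hidden-layer tanh predictor: returns f(x) and its gradient df/dx."""
    h = np.tanh(W1 @ x + b1)
    y = w2 @ h + b2
    grad = (w2 * (1.0 - h ** 2)) @ W1  # chain rule through tanh
    return y, grad


def ekf_enhance(noisy, params, q, r, p=4):
    """Filter `noisy` with an EKF whose state holds the last p clean samples.

    Assumed model (illustrative only):
        s[k] = f(s[k-1], ..., s[k-p]) + w[k],  w ~ N(0, q)
        y[k] = s[k] + v[k],                    v ~ N(0, r)
    """
    W1, b1, w2, b2 = params
    noisy = np.asarray(noisy, dtype=float)
    x = np.zeros(p)   # state estimate: most recent sample first
    P = np.eye(p)     # state error covariance
    Q = np.zeros((p, p))
    Q[0, 0] = q       # process noise drives only the newest sample
    out = np.empty_like(noisy)
    for k, y in enumerate(noisy):
        # Prediction: propagate the state through the nonlinear predictor.
        s_pred, grad = mlp_predict(x, W1, b1, w2, b2)
        F = np.zeros((p, p))
        F[0] = grad                   # linearized predictor (Jacobian row)
        F[1:, :-1] = np.eye(p - 1)    # shift older samples down
        x_pred = np.concatenate(([s_pred], x[:-1]))
        P_pred = F @ P @ F.T + Q
        # Update: the observation is the first state component plus noise.
        S = P_pred[0, 0] + r          # innovation variance
        K = P_pred[:, 0] / S          # Kalman gain
        x = x_pred + K * (y - x_pred[0])
        P = P_pred - np.outer(K, P_pred[0])
        out[k] = x[0]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, hidden = 4, 8
    # Random (untrained) weights stand in for a trained predictor.
    params = (0.3 * rng.standard_normal((hidden, p)),
              np.zeros(hidden),
              0.3 * rng.standard_normal(hidden),
              0.0)
    clean = np.sin(0.1 * np.arange(200))
    noisy = clean + 0.3 * rng.standard_normal(200)
    enhanced = ekf_enhance(noisy, params, q=0.05, r=0.09, p=p)
    print(enhanced[:5])
```

In a full NPHMM setting, each HMM state would carry its own predictor weights, and the filter would have to weight or select among the state-dependent predictors at every sample; that part is omitted here.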
