Artificial bandwidth extension of speech signals using MMSE estimation based on a hidden Markov model

We present an algorithm that derives 7 kHz wideband speech from narrowband "telephone speech". The approach is statistical and is based on a hidden Markov model (HMM) of the speech production process. We propose a new method for estimating the wideband spectral envelope that uses nonlinear state-specific techniques to minimize a mean square error criterion. In contrast to common memoryless estimation methods, the HMM allows additional information from adjacent signal frames to be exploited. The new estimation rule shows a consistent advantage over previously published HMM-based hard or soft classification methods.
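
The idea of combining state-specific estimates through an HMM can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the two-state toy model, and all numeric values are hypothetical. The sketch assumes that the observation likelihoods p(x_m | S_i) and the per-state estimates of the wideband envelope are already available; how they are obtained is outside its scope. State posteriors are computed with the standard normalized forward recursion, which lets earlier frames influence the current frame. The MMSE-style estimate then weights the per-state estimates by these posteriors, while hard classification keeps only the most probable state.

```python
def forward_posteriors(initial, transition, likelihoods):
    """Return P(S_i | x_1..x_m) for every frame m (normalized forward recursion).

    initial[i]        : P(S_i) before the first frame (assumed known)
    transition[i][j]  : P(S_j at frame m | S_i at frame m-1)
    likelihoods[m][i] : p(x_m | S_i), supplied by the caller
    """
    n = len(initial)
    prior = list(initial)
    posteriors = []
    for frame_lik in likelihoods:
        # Combine the predicted state prior with the current observation.
        alpha = [frame_lik[i] * prior[i] for i in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
        posteriors.append(alpha)
        # Predict the state prior of the next frame from the transition matrix.
        prior = [sum(alpha[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return posteriors


def mmse_estimate(posterior, state_estimates):
    """Posterior-weighted sum of per-state estimates (hypothetical inputs)."""
    dim = len(state_estimates[0])
    return [sum(p * est[d] for p, est in zip(posterior, state_estimates))
            for d in range(dim)]


def hard_estimate(posterior, state_estimates):
    """Hard classification: keep only the estimate of the most probable state."""
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return list(state_estimates[best])


# Toy example with two states; every number here is made up for illustration.
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]
likelihoods = [[0.8, 0.2],   # frame 1 favors state 0
               [0.5, 0.5]]   # frame 2 is ambiguous on its own
state_estimates = [[1.0, 0.0], [0.0, 1.0]]

posteriors = forward_posteriors(initial, transition, likelihoods)
soft = mmse_estimate(posteriors[-1], state_estimates)
hard = hard_estimate(posteriors[-1], state_estimates)
```

In this toy case the second frame carries no information by itself, so a memoryless estimator would weight both states equally. The forward recursion instead carries over the evidence from the first frame, which is the kind of inter-frame information the abstract refers to.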
