Feature selection for improved bandwidth extension of speech signals

Artificial bandwidth extension (BWE) aims to convert speech signals of "standard telephone" quality (frequencies up to 3.4 kHz) into 7 kHz wideband speech. The key to high-quality BWE is the estimation of the spectral envelope of the wideband speech. In general, this estimate is based on a number of features extracted from the narrowband input speech signal. We investigate potential features and evaluate their suitability for the BWE application. The quality of each feature is quantified in terms of two statistical measures: mutual information and separability. The best BWE results are obtained by using a large feature "super-vector" (→ high mutual information) which is subsequently reduced in dimension by a linear discriminant analysis (→ large separability). This solution also helps to reduce the computational complexity of estimating the wideband spectral envelope.
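
The dimension-reduction step can be illustrated with a minimal sketch of a linear discriminant analysis applied to a feature super-vector. This is not the authors' implementation. The function name `lda_projection`, the synthetic data, and the use of J = tr(S_w^{-1} S_b) as the separability measure are assumptions made for illustration; the abstract does not state which definition of separability is used or how the classes are formed (in a BWE system they might be, for example, codebook indices of the wideband spectral envelope).

```python
import numpy as np

def lda_projection(X, labels, n_components):
    """Sketch of LDA: returns (W, J) for features X (n_frames, n_dims).

    W holds the leading eigenvectors of S_w^{-1} S_b as columns;
    J = trace(S_w^{-1} S_b) is one common separability criterion
    (an assumed choice, not taken from the source).
    """
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    M = np.linalg.solve(S_w, S_b)  # S_w^{-1} S_b without an explicit inverse
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    W = eigvecs[:, order].real
    J = float(np.trace(M))
    return W, J

# Synthetic stand-in for a 10-dimensional narrowband feature super-vector
# with 3 hypothetical classes; real features and classes would come from speech data.
rng = np.random.default_rng(0)
n_per_class, n_dims, n_classes = 200, 10, 3
class_means = rng.normal(scale=2.0, size=(n_classes, n_dims))
X = np.vstack([rng.normal(loc=m, size=(n_per_class, n_dims)) for m in class_means])
labels = np.repeat(np.arange(n_classes), n_per_class)

W, J = lda_projection(X, labels, n_components=2)
Z = X @ W  # reduced feature vectors, one row per frame
print(W.shape, Z.shape, J)
```

With C classes the between-class scatter has rank at most C - 1, so keeping C - 1 components retains the full value of this criterion while shrinking the vector passed on to the envelope estimator.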
