An upper bound on the quality of artificial bandwidth extension of narrowband speech signals

The aim of artificial bandwidth extension (BWE) of speech signals is to recover wideband speech from bandlimited speech. Because a BWE algorithm is supposed to operate without additional side information on the original wideband speech, it has to exploit mutual dependencies between the available and missing frequency bands of the speech signal. In this paper, BWE is examined from an information-theoretic perspective. After defining a performance measure and introducing a few assumptions on a generalized BWE algorithm, a general relationship between mutual information and the maximum achievable estimation performance is formulated, which yields an upper bound on the performance of BWE algorithms. Finally, some measurements for a representative BWE scenario are presented.
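The link between mutual information and achievable estimation quality can be illustrated with a standard textbook result; this is a minimal sketch, not the performance measure or the bound derived in the paper. It assumes scalar variables and mean squared error: for any estimator of Y from X, the error is at least exp(2(h(Y) - I(X;Y))) / (2πe), with entropies in nats. The function names, the correlation value, and the variance are hypothetical choices for illustration. In the jointly Gaussian case the bound is met by the linear minimum-MSE estimator, which the example checks numerically.

```python
import math


def gaussian_entropy(var: float) -> float:
    """Differential entropy (nats) of a scalar Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


def gaussian_mutual_information(rho: float) -> float:
    """Mutual information (nats) of two jointly Gaussian scalars
    with correlation coefficient rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


def mse_lower_bound(h_y: float, mi: float) -> float:
    """Lower bound on E[(Y - Yhat(X))^2] for any estimator Yhat,
    given h(Y) and I(X;Y) in nats (uses h(Y|X) = h(Y) - I(X;Y))."""
    return math.exp(2.0 * (h_y - mi)) / (2.0 * math.pi * math.e)


# Hypothetical numbers: X stands in for the available band, Y for the missing band.
rho = 0.8      # assumed correlation between X and Y
var_y = 2.0    # assumed variance of Y

mi = gaussian_mutual_information(rho)
bound = mse_lower_bound(gaussian_entropy(var_y), mi)

# Error of the linear minimum-MSE estimator in the jointly Gaussian case.
mmse = var_y * (1.0 - rho ** 2)

print(f"I(X;Y) = {mi:.4f} nats")
print(f"MSE lower bound = {bound:.4f}, Gaussian MMSE = {mmse:.4f}")
```

The more information the available band carries about the missing band, the smaller the bound becomes; with zero mutual information the bound for a Gaussian Y equals its variance, so no estimator can beat guessing the mean.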
