Sparse probabilistic state mapping and its application to speech bandwidth expansion

In this paper, we present a probabilistic algorithm that learns a mapping between two subspaces by representing each subspace as a collection of states. Arbitrarily increasing the number of states over-fits the training data without revealing the underlying structure of the map. We propose a method that imposes sparsity constraints on the state map by using entropic priors. The resulting probabilistic model is applied to the problem of artificial bandwidth expansion, which involves estimating the missing frequency components (0 – 0.3 kHz and 3.7 – 8 kHz) of speech given the narrowband speech signal (0.3 – 3.7 kHz).
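
To illustrate why an entropic prior favors a sparse state map, the sketch below scores two hypothetical maps. It assumes the common form of the entropic prior, P(θ) ∝ exp(−H(θ)), where H is the Shannon entropy of each row of the map; the abstract does not state the exact form used, so this is an assumption. The function names and the 3×3 example maps are illustrative and are not taken from the paper.

```python
import math

def entropy(dist):
    """Shannon entropy in nats of a discrete distribution; 0*log(0) is treated as 0."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)

def entropic_log_prior(state_map):
    """Unnormalized log of the assumed entropic prior: minus the summed row entropies."""
    return -sum(entropy(row) for row in state_map)

# Hypothetical state maps: row i is P(wideband state | narrowband state i).
dense_map = [[1 / 3, 1 / 3, 1 / 3] for _ in range(3)]
sparse_map = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

dense_score = entropic_log_prior(dense_map)
sparse_score = entropic_log_prior(sparse_map)

# The sparse map has lower entropy, so the prior assigns it a higher score.
print(f"dense: {dense_score:.3f}  sparse: {sparse_score:.3f}")
```

Under this assumed prior, a map in which each narrowband state points mostly to one wideband state scores higher than a uniform map, which is the sense in which the prior discourages spreading probability over many states.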
