Paper information - An improved sparse reconstruction algorithm for speech compressive sensing using structured priors

This work addresses sparse reconstruction in compressive sensing (CS) of speech signals. We propose a novel sparse reconstruction algorithm, built on the approximate message passing (AMP) framework, that exploits the intrinsic structure of real-life speech signals in the modified discrete cosine transform (MDCT) domain. A Gaussian mixture model characterizes the marginal distribution of the MDCT coefficients, and a first-order Markov chain model captures the dependencies between neighboring MDCT coefficients. The parameters of both models are learned adaptively with an expectation-maximization (EM) procedure. In reconstruction experiments on real speech signals, the new algorithm performed significantly better than several state-of-the-art algorithms.
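
To make the role of the Gaussian mixture prior more concrete, here is a minimal sketch of a scalar MMSE denoiser of the kind an AMP-style iteration could apply to each coefficient. It is not the paper's implementation. It assumes a zero-mean Gaussian mixture prior and Gaussian effective noise of variance `tau`, and it leaves out the Markov chain coupling and the EM parameter updates. The function name `gm_denoise`, its parameters, and the example values are all hypothetical.

```python
import math


def gm_denoise(r, tau, weights, variances):
    """Posterior mean of x given r = x + noise, with noise ~ N(0, tau).

    Assumed prior (illustrative, not taken from the paper):
    x ~ sum_k weights[k] * N(0, variances[k]), with all weights > 0.
    """
    # Log-evidence of each component: under component k, r ~ N(0, variances[k] + tau).
    log_ev = [
        math.log(w) - 0.5 * math.log(2 * math.pi * (v + tau)) - 0.5 * r * r / (v + tau)
        for w, v in zip(weights, variances)
    ]
    # Normalize in the log domain for numerical stability.
    peak = max(log_ev)
    unnorm = [math.exp(le - peak) for le in log_ev]
    total = sum(unnorm)
    resp = [u / total for u in unnorm]
    # Each component contributes a Wiener-style shrinkage v / (v + tau) of r,
    # weighted by its posterior responsibility.
    return sum(p * v / (v + tau) * r for p, v in zip(resp, variances))


# Hypothetical two-component prior: a narrow "near-zero" component and a wide "active" one.
weights = [0.9, 0.1]
variances = [0.01, 1.0]
small = gm_denoise(0.1, 0.05, weights, variances)   # shrunk strongly toward zero
large = gm_denoise(2.0, 0.05, weights, variances)   # kept almost unchanged
```

With such a prior, small observations are attributed mostly to the narrow component and shrunk heavily, while large observations are attributed to the wide component and pass through nearly intact. This is the behavior that makes a mixture prior suitable for sparse coefficients.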
