Speech enhancement based on approximate message passing

To overcome the limitations of conventional speech enhancement methods, such as inaccurate voice activity detection (VAD) and noise estimation, a novel speech enhancement algorithm based on approximate message passing (AMP) is proposed. AMP exploits the difference in sparsity between speech and noise to suppress the noise in the corrupted speech and to reconstruct the clean speech efficiently. More specifically, the prior probability distribution of the sparse speech coefficients is characterized by a Gaussian model, and the hyper-parameters of the prior model are learned by the expectation-maximization (EM) algorithm. Because the speech coefficients of adjacent frames are correlated, the k-nearest neighbor (k-NN) algorithm is used to learn the sparsity. Computational simulations validate the proposed algorithm, which achieves better speech enhancement performance than four baseline methods, namely Wiener filtering, subspace pursuit (SP), distributed sparsity adaptive matching pursuit (DSAMP), and expectation-maximization Gaussian-model approximate message passing (EM-GAMP), under different compression ratios and a wide range of signal-to-noise ratios (SNRs).
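To make the AMP recovery step concrete, the following is a minimal sketch of the basic AMP iteration for a sparse vector observed through compressed measurements. It is not the algorithm described above: it assumes a plain soft-thresholding denoiser in place of the Gaussian prior with EM-learned hyper-parameters, it omits the k-NN sparsity learning, and it uses a random Gaussian sensing matrix with a synthetic sparse vector standing in for speech coefficients. The names `soft_threshold`, `amp_recover`, `demo`, and the parameter `alpha` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau (assumed denoiser)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def amp_recover(A, y, n_iter=30, alpha=1.5):
    """Basic AMP with a soft-thresholding denoiser (illustrative only).

    A is an (m, n) sensing matrix with roughly unit-norm columns,
    y is the length-m measurement vector, and alpha scales the threshold
    relative to the estimated effective noise level.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()  # residual
    for _ in range(n_iter):
        # Effective noise level estimated from the current residual.
        tau = alpha * np.linalg.norm(z) / np.sqrt(m)
        # Pseudo-data: current estimate plus back-projected residual.
        r = x + A.T @ z
        x_new = soft_threshold(r, tau)
        # Onsager correction: previous residual scaled by the number of
        # nonzero outputs of the denoiser divided by m.
        onsager = (np.count_nonzero(x_new) / m) * z
        z = y - A @ x_new + onsager
        x = x_new
    return x


def demo(seed=0, n=400, m=200, k=20, noise_std=0.01):
    """Recover a synthetic k-sparse vector and return the relative error."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.standard_normal(k)
    y = A @ x_true + noise_std * rng.standard_normal(m)
    x_hat = amp_recover(A, y)
    return np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)


if __name__ == "__main__":
    print("relative reconstruction error:", demo())
```

The Onsager term is what distinguishes AMP from plain iterative thresholding. In a prior-based variant such as the one summarized above, the soft-thresholding step would be replaced by a denoiser derived from the assumed prior, with its hyper-parameters updated between iterations.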
