A sparsity-relaxed algorithm for the under-determined convolutive blind source separation

Convolutive blind source separation (CBSS) is a signal processing method that separates multiple sources from a convolutive mixture. Its goal is to recover the latent sources in a reverberant environment. A two-stage scheme, consisting of mixing matrix estimation followed by source recovery, is usually adopted for this purpose. In this paper, we focus on the source recovery problem given an estimated mixing matrix. Specifically, this problem can be formulated as a sparse source reconstruction optimization model, especially in the under-determined case, where the number of sources is greater than the number of microphones. Inspired by the fact that only a few source components are active at each time-frequency slot, a new augmented Lagrange method is proposed to find the optimal sparse solution of the sources with an ℓp-norm (0<p<1) based measurement function. The proposed method relaxes the strict sparsity assumption on the sources and hence improves the source separation performance. The experimental results demonstrate that the proposed algorithm is superior to the state-of-the-art methods.
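
To make the source recovery step concrete, the following is a minimal sketch of ℓp-norm (0<p<1) sparse recovery at a single time-frequency slot, i.e. minimizing the sum of |s_i|^p subject to x = A s. It is not the augmented Lagrange method proposed in the paper, whose details are not given here; it substitutes a standard iteratively reweighted least squares (IRLS) scheme as a stand-in. The function name `lp_recover`, the parameters `p`, `n_iter`, and `eps`, and the 2-microphone, 3-source toy mixture are all assumptions made for illustration.

```python
import numpy as np

def lp_recover(A, x, p=0.5, n_iter=30, eps=1.0):
    """Approximately minimize sum(|s_i|**p) subject to A @ s = x (IRLS sketch).

    A : (M, N) complex mixing matrix at one frequency bin, with M < N.
    x : (M,) complex mixture vector at one time-frequency slot.
    This is an illustrative stand-in, not the paper's augmented Lagrange method.
    """
    # Start from the minimum l2-norm solution.
    s = np.linalg.pinv(A) @ x
    for _ in range(n_iter):
        # Inverse IRLS weights: small for near-zero entries, pushing them to zero.
        q = (np.abs(s) ** 2 + eps) ** (1.0 - p / 2.0)
        AQ = A * q  # same as A @ diag(q)
        # Weighted minimum-norm solution: s = Q A^H (A Q A^H)^{-1} x.
        s = q * (A.conj().T @ np.linalg.solve(AQ @ A.conj().T, x))
        # Gradually shrink the smoothing term (hypothetical schedule).
        eps = max(eps / 10.0, 1e-8)
    return s

# Toy under-determined case: 2 microphones, 3 sources, 2 of them active.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
s_true = np.array([1.0 + 0.5j, 0.0, -0.7 + 0.2j])
x = A @ s_true
s_hat = lp_recover(A, x)
```

Every iterate satisfies the mixing constraint A s = x up to numerical precision, while the reweighting drives small components toward zero. Because the ℓp objective is non-convex, the sketch may settle on a local minimum, and with two active sources and two microphones the sparsest feasible solution is generally not unique, so `s_hat` need not equal `s_true`.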
