Sparse Reverberant Audio Source Separation via Reweighted Analysis

We propose a novel algorithm for estimating source signals from an underdetermined convolutive mixture, assuming the mixing filters are known. Most state-of-the-art methods deal with anechoic mixtures or mixtures with short reverberation, and assume a synthesis sparse prior in the time-frequency domain together with a narrowband approximation of the convolutive mixing process. In this paper, we address source estimation from convolutive mixtures with a new algorithm based on i) an analysis sparse prior, ii) a reweighting scheme that increases sparsity, and iii) a wideband data-fidelity term in constrained form. We show, through theoretical discussion and simulations, that this algorithm is particularly well suited to source separation of realistic reverberant mixtures. In particular, the proposed algorithm outperforms state-of-the-art methods on reverberant mixtures of audio sources by more than 2 dB of signal-to-distortion ratio on the BSS Oracle dataset.
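Two of the ingredients named above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `mix_wideband` and `reweight`, the array layouts, and the weight rule `delta / (delta + |z|)` (one common choice in reweighted l1 schemes) are assumptions made here for illustration. The first function applies known mixing filters by full time-domain convolution, which is the operator a wideband data-fidelity term would compare against the observed mixture. The second computes the weights used in the next weighted-l1 pass.

```python
import numpy as np


def mix_wideband(sources, filters):
    """Mix sources through known filters by full time-domain convolution.

    Hypothetical layout: sources is (N, T), filters is (M, N, L).
    Returns the mixtures with shape (M, T + L - 1).
    """
    n_src, n_samples = sources.shape
    n_mics, _, filt_len = filters.shape
    mixtures = np.zeros((n_mics, n_samples + filt_len - 1))
    for m in range(n_mics):
        for n in range(n_src):
            # No narrowband approximation: the whole filter is convolved.
            mixtures[m] += np.convolve(filters[m, n], sources[n])
    return mixtures


def reweight(analysis_coeffs, delta):
    """Weights for the next weighted-l1 pass (assumed rule).

    Coefficients near zero get a weight near 1 and are penalised fully;
    large coefficients get a weight near 0 and are penalised less.
    """
    return delta / (delta + np.abs(analysis_coeffs))


# Underdetermined toy case: 3 sources, 2 mixture channels.
rng = np.random.default_rng(0)
toy_sources = rng.standard_normal((3, 64))
toy_filters = rng.standard_normal((2, 3, 16))
toy_mixtures = mix_wideband(toy_sources, toy_filters)
toy_weights = reweight(np.array([0.0, 1.0, -3.0]), delta=1.0)
```

In a full reweighted scheme, a weighted-l1 analysis problem would be solved subject to a bound on the residual between `mix_wideband(estimate, filters)` and the observed mixture, then the weights would be refreshed from the analysis coefficients of the new estimate and the solve repeated.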
