Sparsity and Low-Rank Amplitude Based Blind Source Separation

This paper presents a new method for the blind source separation problem in reverberant environments with more sources than microphones (the underdetermined case). Building on the sparsity of the sources in the time-frequency domain and a low-rank assumption on the source spectrograms, the STRAUSS (SparsiTy and low-Rank AmplitUde based Source Separation) method is developed. Numerical evaluations show that the proposed method outperforms existing multichannel NMF approaches, even though it relies exclusively on amplitude information.
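To make the low-rank assumption concrete, here is a minimal sketch of approximating a nonnegative amplitude spectrogram by a product of two small nonnegative factors. It is not the STRAUSS algorithm, which the abstract does not specify. It uses the standard multiplicative-update NMF for the Euclidean cost, and the function name `nmf_euclidean`, the matrix sizes, and the synthetic "spectrogram" are all assumptions chosen for illustration.

```python
import numpy as np


def nmf_euclidean(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H.

    Hypothetical helper: standard multiplicative updates for the
    Euclidean cost ||V - W H||_F^2, not the method of the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random positive initialization keeps the factors nonnegative.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for an amplitude spectrogram with exact rank 2
# (assumed sizes: 64 frequency bins, 100 time frames).
rng = np.random.default_rng(1)
W_true = rng.random((64, 2))
H_true = rng.random((2, 100))
V = W_true @ H_true

W, H = nmf_euclidean(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the columns of `W` play the role of spectral templates and the rows of `H` their activations over time. A real pipeline would compute `V` from the magnitude of a short-time Fourier transform of the recorded signal rather than from random factors.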
