Underdetermined Blind Audio Source Separation Using Modal Decomposition

This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. The approach rests on the observation that audio signals, and musical signals in particular, can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach: a signal analysis step (extraction of the modal components) followed by a signal synthesis step (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: the empirical mode decomposition (EMD) algorithm and a parametric estimation algorithm based on the ESPRIT technique. A major advantage of the proposed method is that it is valid for both instantaneous and convolutive mixtures and can separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.
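To make the two-step idea concrete, the following toy sketch illustrates the synthesis (grouping) step on a noiseless instantaneous mixture of three sources observed on two sensors. It is an illustration under strong assumptions, not the algorithms of the paper: the modal components are taken as already known (in practice they would come from an analysis step such as EMD or ESPRIT), the spatial vector of each component is obtained by ordinary least squares, and the grouping uses a basic k-means on normalized vectors. All names (`cluster_directions`, `params`, `theta`) and all numerical values are hypothetical.

```python
import numpy as np

def cluster_directions(vectors, n_sources, n_iter=20):
    """Group spatial vectors by direction with a basic k-means (illustrative only)."""
    # Normalize to unit norm and fix the sign so that v and -v fall together.
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    v = v * np.where(v[:, :1] < 0, -1.0, 1.0)
    # Deterministic farthest-point initialization of the cluster centers.
    centers = [v[0]]
    for _ in range(1, n_sources):
        d = np.min([np.linalg.norm(v - c, axis=1) for c in centers], axis=0)
        centers.append(v[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(v[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for k in range(n_sources):
            if np.any(labels == k):
                centers[k] = v[labels == k].mean(axis=0)
    return labels

fs = 8000.0                      # sampling rate in Hz (assumed)
t = np.arange(2000) / fs

# (amplitude, damping [1/s], frequency [Hz], phase [rad]) per modal component;
# two components per source, values chosen arbitrarily for the example.
params = [
    (1.0, 3.0, 220.0, 0.0), (0.6, 5.0, 440.0, 0.3),   # source 0
    (0.9, 4.0, 330.0, 0.1), (0.5, 6.0, 660.0, 0.7),   # source 1
    (0.8, 8.0, 523.0, 0.2), (0.4, 7.0, 784.0, 0.5),   # source 2
]
C = np.array([a * np.exp(-d * t) * np.cos(2 * np.pi * f * t + p)
              for a, d, f, p in params])              # components, shape (6, T)
S = C.reshape(3, 2, -1).sum(axis=1)                   # sources, shape (3, T)

# Instantaneous 2x3 mixing matrix: more sources than sensors.
theta = np.array([0.2, 0.8, 1.4])
A = np.vstack([np.cos(theta), np.sin(theta)])
X = A @ S                                             # sensor signals, shape (2, T)

# Spatial vector of each component by least squares: X ~ V.T @ C.
V = np.linalg.lstsq(C.T, X.T, rcond=None)[0]          # shape (6, 2)

# Synthesis: cluster the spatial vectors, then sum the components of each cluster.
labels = cluster_directions(V, n_sources=3)
images = np.array([V[labels == k].T @ C[labels == k] for k in range(3)])
```

Here `images[k]` is the contribution of cluster `k` to the two sensor signals; the cluster order is arbitrary, as expected for blind separation. The sketch does not cover the convolutive case or noisy, estimated components.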
