Blind Spectral-GMM Estimation for Underdetermined Instantaneous Audio Source Separation

The underdetermined blind audio source separation problem is often addressed in the time-frequency domain by assuming that each time-frequency point is an independently distributed random variable. Other approaches, which are not blind, assume a more structured model such as Spectral Gaussian Mixture Models (Spectral-GMMs), thereby exploiting the statistical diversity of audio sources in the separation process. In these approaches, however, the Spectral-GMMs are assumed to be learned from training signals. In this paper, we propose a new approach for learning the Spectral-GMMs of the sources without the need for training signals. The proposed blind method significantly outperforms state-of-the-art approaches on stereophonic instantaneous music mixtures.
