Bayesian blind separation of audio mixtures with structured priors

In this paper we describe a Bayesian approach to the separation of linear instantaneous mixtures of audio sources. Our method exploits the sparsity of the source expansion coefficients on a time-frequency basis, chosen here to be a modified discrete cosine transform (MDCT). Each source coefficient is paired with a binary indicator variable: conditionally on the indicator being 0 or 1, the coefficient is either set to zero or given a Student t prior. Structured priors can be placed on the indicator variables, such as horizontal structures in the time-frequency plane, in order to model temporal persistence. A Gibbs sampler (a standard Markov chain Monte Carlo technique) is used to sample from the posterior distribution of the indicator variables, the source coefficients (those corresponding to nonzero indicator variables), the hyperparameters of the Student t priors, the mixing matrix and the noise variance. We give results for the separation of a musical stereo mixture of 3 sources.
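To make the prior structure concrete, the following is a minimal sketch of the generative model the abstract describes, not the paper's implementation. It draws binary indicators with horizontal (temporal) persistence, gives active coefficients a Student t prior, and forms a noisy instantaneous mixture. Several details are assumptions made for illustration only: the indicators in each frequency bin follow a two-state Markov chain over frames, the Student t draw uses the standard scale-mixture-of-Gaussians representation, and every function name, parameter name and numerical value (`p_stay_on`, `p_turn_on`, `dof`, `scale`, the mixing matrix) is hypothetical. Because the MDCT is an orthonormal transform, the mixture is written directly in the coefficient domain.

```python
import numpy as np


def sample_structured_sparse_coeffs(n_freq, n_frames, p_init=0.1, p_stay_on=0.9,
                                    p_turn_on=0.02, dof=3.0, scale=1.0, seed=0):
    """Draw indicators and coefficients from a structured sparse prior.

    Assumption: each frequency bin carries an independent two-state Markov
    chain over frames, which is one way to encode horizontal structure.
    """
    rng = np.random.default_rng(seed)
    gamma = np.zeros((n_freq, n_frames), dtype=int)
    gamma[:, 0] = rng.random(n_freq) < p_init
    for t in range(1, n_frames):
        # An active bin tends to stay active in the next frame.
        p_on = np.where(gamma[:, t - 1] == 1, p_stay_on, p_turn_on)
        gamma[:, t] = rng.random(n_freq) < p_on

    # Student t as a scale mixture of Gaussians:
    # v ~ InvGamma(dof/2, dof*scale^2/2), then s | v ~ N(0, v).
    v = (dof * scale ** 2 / 2.0) / rng.gamma(dof / 2.0, 1.0, size=gamma.shape)
    coeffs = gamma * rng.normal(0.0, np.sqrt(v))  # exactly zero where gamma == 0
    return gamma, coeffs


def mix(sources, mixing_matrix, noise_std=0.01, seed=1):
    """Linear instantaneous mixture x = A s + noise in the coefficient domain."""
    rng = np.random.default_rng(seed)
    noise = noise_std * rng.normal(size=(mixing_matrix.shape[0], sources.shape[1]))
    return mixing_matrix @ sources + noise


if __name__ == "__main__":
    n_freq, n_frames = 64, 50
    # Three sources, each flattened to one row of coefficients.
    sources = np.stack([
        sample_structured_sparse_coeffs(n_freq, n_frames, seed=k)[1].ravel()
        for k in range(3)
    ])
    A = np.array([[0.9, 0.6, 0.2],
                  [0.3, 0.7, 0.95]])  # hypothetical 2 x 3 stereo mixing matrix
    x = mix(sources, A)
    print(x.shape, float(np.mean(sources != 0)))
```

A Gibbs sampler for this model would run in the opposite direction: given the observed mixture, it would alternately resample the indicators, the active coefficients, the Student t hyperparameters, the mixing matrix and the noise variance from their conditional distributions.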
