Variational Bayesian model averaging for audio source separation

Non-negative Matrix Factorization (NMF) is widely used in audio source separation to build source-specific models. The number of NMF components is known to have a noticeable influence on separation quality, and many methods have therefore been proposed to select the best model order for a given task. We go a step further and propose averaging over model orders instead of selecting a single one. Because existing techniques do not allow effective averaging, we introduce a generative model in which the number of components is a random variable, together with a modification of conventional variational Bayesian (VB) inference. Experiments on synthetic data are promising: our model yields better separation and is less computationally demanding than conventional VB model selection.
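
To make the contrast between model selection and model averaging concrete, the generic Bayesian form can be written as below. This is the standard textbook formulation, not necessarily the exact model of this work; the symbols $x$ (observed mixture), $s$ (source to estimate), $K$ (number of NMF components) and $K_{\max}$ (largest order considered) are notation assumed here for illustration.

```latex
% Posterior probability of each model order K (Bayes' rule over orders)
p(K \mid x) = \frac{p(x \mid K)\, p(K)}{\sum_{K'=1}^{K_{\max}} p(x \mid K')\, p(K')}

% Model selection: keep only the most probable order
\hat{K} = \arg\max_{K} \; p(K \mid x),
\qquad
\hat{s}_{\mathrm{sel}} = \mathbb{E}\left[ s \mid x, \hat{K} \right]

% Model averaging: weight the estimate of every order by its posterior probability
\hat{s}_{\mathrm{avg}} = \mathbb{E}\left[ s \mid x \right]
  = \sum_{K=1}^{K_{\max}} p(K \mid x)\, \mathbb{E}\left[ s \mid x, K \right]
```

Under this reading, selection is the special case in which all posterior mass is placed on a single order, while averaging retains the contribution of every plausible order.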
