Adaptation of source-specific dictionaries in Non-Negative Matrix Factorization for source separation

This paper addresses the adaptation of spectrum dictionaries in supervised audio source separation. Assuming that samples of the audio sources to be separated are available, a filter adaptation in the frequency domain is proposed in the context of Non-Negative Matrix Factorization with the Itakura-Saito divergence. The algorithm retrieves the acoustical filter applied to the sources with good accuracy and achieves significantly higher performance on separation tasks than the non-adaptive model.
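To make the idea of frequency-domain filter adaptation concrete, the following is a minimal sketch for a single source, not the paper's actual algorithm. It assumes the power spectrogram V is modeled as diag(h) W H, where W is a dictionary learned beforehand and kept fixed, H holds the activations, and h is a non-negative gain per frequency bin standing in for the acoustical filter. The function names (`is_divergence`, `adapt_filter`), the standard multiplicative update for H, and the closed-form update for h are all assumptions made for illustration.

```python
import numpy as np

def is_divergence(V, V_hat):
    """Itakura-Saito divergence between two strictly positive arrays."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))

def adapt_filter(V, W, n_iter=100, eps=1e-12, seed=0):
    """Fit V ~ diag(h) @ W @ H with W fixed (illustrative, single source).

    V : (F, T) positive power spectrogram of the observed, filtered source
    W : (F, K) positive dictionary learned from clean samples (kept fixed)
    Returns the per-frequency gains h, the activations H, and the
    divergence recorded after each iteration.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    K = W.shape[1]
    h = np.ones(F)                    # start from an all-pass filter
    H = rng.random((K, T)) + 0.1      # positive random activations
    history = []
    for _ in range(n_iter):
        # Multiplicative update of H under the filtered dictionary.
        Wf = h[:, None] * W
        V_hat = Wf @ H + eps
        H *= (Wf.T @ (V / V_hat**2)) / (Wf.T @ (1.0 / V_hat) + eps)
        # With W and H fixed, the IS-optimal gain of each frequency bin
        # is the time average of V / (W @ H) for that bin.
        P = W @ H + eps
        h = np.mean(V / P, axis=1)
        history.append(is_divergence(V, h[:, None] * P))
    return h, H, history
```

Note that h and H are only defined up to a common scale factor in this model, so a recovered filter should be compared with a reference after normalization. With several sources in a mixture, each with its own filter, the gain update no longer has this closed form and would need its own iterative rule.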
