An iterative approach to monaural musical mixture de-soloing

In this article, we introduce a novel approach to monaural source separation that aims to split a polyphonic musical recording into two sources: a main instrument (or melody) track and an accompaniment track. To this end, we model the power spectral densities (PSDs) of the two contributions separately: a source/filter model for the main instrument, and a model emphasizing temporal repetitions for the musical background. We show that separation performance improves with a two-step estimation strategy, in which the model parameters are re-estimated in a second stage using the main melody line estimated in the first stage. Experiments conducted on several monaural signal databases show that our system achieves state-of-the-art performance compared to other unsupervised source separation algorithms.
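To illustrate how two PSD models can yield two tracks, the following is a minimal sketch that assumes the PSD estimates are already available and that the separation is done by Wiener-style soft masking of the mixture's short-time Fourier transform. The abstract does not state the masking rule, so this choice, the function name `wiener_separate`, and the random placeholder data are all assumptions made for illustration; the estimation of the source/filter and accompaniment models is not shown.

```python
import numpy as np

def wiener_separate(mix_stft, psd_lead, psd_accomp, eps=1e-12):
    """Split a mixture STFT into lead and accompaniment estimates.

    Assumption: each time-frequency bin of the mixture is shared between
    the two sources in proportion to their estimated PSDs (Wiener-style mask).
    """
    total = psd_lead + psd_accomp + eps   # eps avoids division by zero
    mask_lead = psd_lead / total          # values in [0, 1)
    lead = mask_lead * mix_stft           # main instrument estimate
    accomp = mix_stft - lead              # remainder goes to the accompaniment
    return lead, accomp

# Placeholder data standing in for a real STFT and real model outputs.
rng = np.random.default_rng(0)
n_freq, n_frames = 5, 4
mix_stft = (rng.standard_normal((n_freq, n_frames))
            + 1j * rng.standard_normal((n_freq, n_frames)))
psd_lead = rng.random((n_freq, n_frames))     # hypothetical lead PSD
psd_accomp = rng.random((n_freq, n_frames))   # hypothetical accompaniment PSD

lead, accomp = wiener_separate(mix_stft, psd_lead, psd_accomp)
```

By construction the two estimates sum back to the mixture. In a two-step scheme such as the one described above, the PSDs passed to this function would presumably be those re-estimated in the second stage.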
