An iterative approach to monaural musical mixture de-soloing

In this article, we introduce a novel approach to monaural source separation that aims to separate a polyphonic musical recording into two main sources: a main instrument (or melody) track and an accompaniment track. To that end, we propose to model the power spectral densities (PSDs) of both contributions, using a source/filter model for the main instrument while retaining a model that emphasizes temporal repetitions for the musical background. We show that source separation performance improves with a two-step estimation strategy in which the model parameters are re-estimated in a second stage, exploiting the main melody line estimated in the first stage. Experiments conducted on several monaural signal databases show that our system achieves state-of-the-art performance compared with other unsupervised source separation algorithms.
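The abstract does not give the model equations or the final separation step, so the following is only a rough sketch under stated assumptions. It assumes that the lead PSD is an elementwise product of a filter envelope and a source excitation, that the accompaniment PSD is a low-rank nonnegative product, that the two PSDs add in each time-frequency bin, and that separation uses Wiener-style soft masks. All names (`wiener_masks`, `filter_env`, `source_exc`, `W`, `H`) and the random data are hypothetical and are not taken from the paper.

```python
import numpy as np

def wiener_masks(psd_lead, psd_accomp, eps=1e-12):
    """Return two soft masks that sum to (almost exactly) 1 in every bin."""
    total = psd_lead + psd_accomp + eps  # eps guards against division by zero
    return psd_lead / total, psd_accomp / total

rng = np.random.default_rng(0)
n_freq, n_frames, n_components = 6, 4, 3

# Assumed lead model: PSD = filter envelope * source excitation (elementwise).
filter_env = rng.random((n_freq, n_frames)) + 0.1
source_exc = rng.random((n_freq, n_frames)) + 0.1
psd_lead = filter_env * source_exc

# Assumed accompaniment model: a few spectral templates W reused over time
# through activations H, which favors repeated patterns.
W = rng.random((n_freq, n_components))
H = rng.random((n_components, n_frames))
psd_accomp = W @ H

# Toy complex spectrogram standing in for the STFT of the mixture.
mixture_stft = (rng.standard_normal((n_freq, n_frames))
                + 1j * rng.standard_normal((n_freq, n_frames)))

# Apply the masks to split the mixture into two complex spectrograms.
mask_lead, mask_accomp = wiener_masks(psd_lead, psd_accomp)
lead_stft = mask_lead * mixture_stft
accomp_stft = mask_accomp * mixture_stft
```

Because the two masks sum to one, the two estimated spectrograms add back up to the mixture. In a two-step scheme such as the one the abstract describes, the PSD parameters would be re-estimated before the masks are computed again; that re-estimation is not shown here.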
