Paper information - Singer melody extraction in polyphonic signals using source separation methods

We propose a new approach to singer melody extraction based on blind source separation techniques. The short-time Fourier transform (STFT) of the singer signal is modelled by a Gaussian mixture model (GMM) explicitly coupled with a generative source/filter model. We then introduce a simplification of this general GMM and approximate the STFT of the music signal using non-negative matrix factorization (NMF) techniques. The melody line is extracted from the explicit source component of the model using a Viterbi algorithm. The results are very promising and comparable to or better than those of state-of-the-art systems.
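
The final step in the abstract, decoding a melody line with a Viterbi algorithm, can be illustrated with a minimal sketch. This is not the authors' implementation. The state space (indices of candidate fundamental frequencies), the per-frame log-scores, and the linear jump penalty in the hypothetical helper `smooth_transition` are all assumptions made for illustration; in the paper the scores come from the source component of the source/filter model.

```python
import numpy as np


def viterbi_path(log_emission, log_transition):
    """Return the highest-scoring state sequence.

    log_emission:   (n_frames, n_states) per-frame log-scores.
    log_transition: (n_states, n_states); entry [i, j] is the log-score
                    of moving from state i to state j.
    """
    n_frames, n_states = log_emission.shape
    score = np.empty((n_frames, n_states))
    backptr = np.zeros((n_frames, n_states), dtype=int)

    score[0] = log_emission[0]
    for t in range(1, n_frames):
        # candidate[i, j]: best score ending in state i at t-1, then moving to j
        candidate = score[t - 1][:, None] + log_transition
        backptr[t] = np.argmax(candidate, axis=0)
        score[t] = candidate[backptr[t], np.arange(n_states)] + log_emission[t]

    # Backtrack from the best final state.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = backptr[t, path[t]]
    return path


def smooth_transition(n_states, penalty=1.0):
    """Assumed transition model: log-score falls linearly with jump size."""
    idx = np.arange(n_states)
    return -penalty * np.abs(idx[:, None] - idx[None, :])


# Toy example: 6 frames, 5 candidate pitch states, one spurious peak.
true_states = [2, 2, 3, 3, 3, 3]
log_emission = np.full((6, 5), -2.0)
for t, s in enumerate(true_states):
    log_emission[t, s] = 0.0
log_emission[2, 0] = 0.5  # outlier that misleads a frame-by-frame argmax

framewise = np.argmax(log_emission, axis=1)
path = viterbi_path(log_emission, smooth_transition(5, penalty=1.0))
print("frame-wise argmax:", framewise.tolist())
print("Viterbi path:     ", path.tolist())
```

In this toy case the frame-by-frame maximum jumps to the spurious state in the third frame, while the decoded path stays on the smooth trajectory because the assumed jump penalty outweighs the small gain in frame score.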
