Melody Extraction from Polyphonic Audio Based on Particle Filter

This paper considers a particle-filter-based algorithm for extracting the melody from polyphonic audio in the short-time Fourier transform (STFT) domain. The extraction focuses on overcoming three difficulties: interference from harmonic and percussive sounds, possible octave mismatch, and dynamic variation in the melody. The main idea of the algorithm is to model the probabilistic relations between the melody and the polyphonic audio. The melody is assumed to follow a Markov process, and the framed segments of polyphonic audio are assumed to be conditionally independent given the parameters that represent the melody. The melody parameters are estimated using sequential importance sampling (SIS), a conventional particle filter method. The likelihood and state transition are defined so as to overcome the aforementioned difficulties. The SIS algorithm relies on a sequential importance density, which is designed using multiple pitches estimated by a simple multi-pitch extraction algorithm. Experimental results show that the considered algorithm outperforms other well-known melody extraction algorithms in terms of raw pitch accuracy (RPA) and raw chroma accuracy (RCA).
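
To make the estimation scheme concrete, the following is a minimal toy sketch of SIS-style pitch tracking, not the paper's implementation. Everything specific in it is an assumption made for illustration: the state is a single pitch in semitones, the Markov transition is a Gaussian random walk, the likelihood is a salience-weighted Gaussian mixture around hypothetical multi-pitch candidates (standing in for the STFT-domain likelihood, which the abstract does not specify), and the importance density is a wider mixture around the same candidates. A resampling step triggered by a low effective sample size is also added; the abstract mentions only SIS, so this step is an addition to keep the toy numerically stable.

```python
import math
import random

SIGMA_TRANS = 1.0  # assumed std. dev. (semitones) of the Markov pitch transition
SIGMA_OBS = 0.3    # assumed width of the toy likelihood around each candidate
SIGMA_PROP = 0.5   # assumed width of the importance density (wider than the likelihood)


def gauss_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, candidates, sigma):
    """Salience-weighted Gaussian mixture over (pitch, salience) candidates."""
    total = sum(s for _, s in candidates)
    return sum(s / total * gauss_pdf(x, p, sigma) for p, s in candidates)


def sample_proposal(candidates, rng):
    """Draw a pitch near one candidate, chosen in proportion to its salience."""
    pitches = [p for p, _ in candidates]
    saliences = [s for _, s in candidates]
    centre = rng.choices(pitches, weights=saliences, k=1)[0]
    return rng.gauss(centre, SIGMA_PROP)


def track_melody(frames, n_particles=1000, seed=0):
    """Return one posterior-mean pitch estimate per frame."""
    rng = random.Random(seed)
    particles = None
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for candidates in frames:
        proposed = [sample_proposal(candidates, rng) for _ in range(n_particles)]
        for i, x in enumerate(proposed):
            lik = mixture_pdf(x, candidates, SIGMA_OBS)
            prop = mixture_pdf(x, candidates, SIGMA_PROP)
            # First frame: flat prior, so the transition term is omitted.
            trans = 1.0 if particles is None else gauss_pdf(x, particles[i], SIGMA_TRANS)
            # SIS weight update: likelihood * transition / importance density.
            weights[i] *= lik * trans / prop
        total = sum(weights)
        weights = [w / total for w in weights]
        particles = proposed
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Added assumption (not in the abstract): resample when the
        # effective sample size falls below half the particle count.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    return estimates


# Hypothetical input: a melody at pitch 60 that moves to 62, with a weaker
# interfering pitch at 67 present in every frame.
frames = [[(60.0, 0.7), (67.0, 0.3)]] * 6 + [[(62.0, 0.7), (67.0, 0.3)]] * 6
estimates = track_melody(frames)
print([round(e, 1) for e in estimates])
```

In this toy setup the first estimate is pulled toward the interfering pitch, because a single frame cannot tell the two candidates apart beyond their salience. The transition model then penalizes jumps between candidates, so later estimates settle on the more salient, temporally consistent pitch line.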
