A hybrid approach to singing pitch extraction based on trend estimation and hidden Markov models

In this paper, we propose a hybrid method for singing pitch extraction from polyphonic music audio. We observed several kinds of pitch errors made by a previously proposed algorithm based on trend estimation, and we noticed that other pitch tracking methods tend to make different types of errors. It is therefore natural to combine the results of several pitch trackers to achieve better accuracy. We adopt three methods as a committee to determine the pitch: the trend-estimation-based method applied to the forward signal, the same method applied to the backward signal, and a training-based HMM method. Experimental results demonstrate that the proposed approach outperforms the best algorithm for the audio melody extraction task in MIREX 2010.
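
The abstract does not state how the committee reaches its decision, so the following is only a minimal sketch of one plausible fusion rule, not the authors' method. It assumes that each tracker outputs one pitch value per frame in semitones (for example, MIDI note numbers), that all frames are voiced, and that the track from the backward signal has already been time-reversed to align with the others. The function name `committee_pitch` and the `tolerance` parameter are hypothetical.

```python
import numpy as np

def committee_pitch(tracks, tolerance=0.5):
    """Fuse per-frame pitch estimates from several trackers (hypothetical rule).

    tracks: array-like of shape (n_trackers, n_frames), pitch in semitones.
    tolerance: two estimates "agree" if they differ by at most this many semitones.
    Returns an array of shape (n_frames,) with the selected pitch per frame.
    """
    tracks = np.asarray(tracks, dtype=float)
    n_frames = tracks.shape[1]

    # diff[i, j, t] = |pitch_i(t) - pitch_j(t)|
    diff = np.abs(tracks[:, None, :] - tracks[None, :, :])

    # votes[i, t] = number of trackers (including i itself) that agree with tracker i
    votes = (diff <= tolerance).sum(axis=1)

    # Tie-break: prefer the estimate closest to the per-frame median.
    dist = np.abs(tracks - np.median(tracks, axis=0))
    penalty = dist / (dist.max() + 1.0)  # always in [0, 1), so votes dominate

    winner = np.argmax(votes - penalty, axis=0)
    return tracks[winner, np.arange(n_frames)]

# Example: three trackers, three frames; each frame has one outlier.
forward = [60.0, 62.0, 64.0]    # assumed: trend estimation on the forward signal
backward = [60.2, 50.0, 64.1]   # assumed: trend estimation on the backward signal, re-aligned
hmm = [72.0, 62.1, 52.0]        # assumed: HMM-based tracker
fused = committee_pitch([forward, backward, hmm])
```

With three trackers, this rule lets any two that agree outvote the third, which matches the motivation that different trackers tend to make different kinds of errors.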
