Advanced query by humming system using diffused hidden Markov model and tempo based dynamic programming

Query by humming (QBH) is a content-based system that identifies which song a person sang. In this paper, we propose a note-based QBH system that applies the hidden Markov model and dynamic programming to find the most likely song. We also propose several techniques to improve QBH performance. First, we propose a modified method for onset detection that also uses frequency information: by time-frequency analysis, we can find onset points that are difficult to pick up in the time domain. Second, besides the pitch feature, beat information and possible pitch and humming errors are considered in melody matching. Tempo is an important part of a song: even if the pitch sequences of two songs are the same, they are completely different songs if their tempos are clearly different. Simulations show that the proposed methods improve performance considerably.
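As an illustration of why rhythm information matters in melody matching, the following is a minimal sketch of a dynamic programming alignment over note sequences. It is not the matching procedure of this paper: the feature choice (pitch intervals and log ratios of consecutive note durations), the cost function, and the names `to_features`, `match_cost`, `rank_songs`, `rhythm_weight`, and `gap_cost` are all assumptions made for the example. Pitch intervals make the cost insensitive to the key the user hums in, duration ratios make it insensitive to a uniform change of speed, and the gap cost tolerates inserted or dropped notes.

```python
import math


def to_features(notes):
    """Turn [(midi_pitch, duration_sec), ...] into one feature per note transition:
    (pitch interval in semitones, log2 ratio of consecutive durations)."""
    return [
        (p2 - p1, math.log2(d2 / d1))
        for (p1, d1), (p2, d2) in zip(notes, notes[1:])
    ]


def match_cost(query, reference, rhythm_weight=2.0, gap_cost=3.0):
    """Edit-distance style alignment cost between two note sequences.
    rhythm_weight and gap_cost are illustrative values, not tuned ones."""
    q, r = to_features(query), to_features(reference)
    n, m = len(q), len(r)
    # cost[i][j]: cheapest alignment of the first i query features
    # with the first j reference features.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_cost
    for j in range(1, m + 1):
        cost[0][j] = j * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pitch_diff = abs(q[i - 1][0] - r[j - 1][0])
            rhythm_diff = abs(q[i - 1][1] - r[j - 1][1])
            substitute = cost[i - 1][j - 1] + pitch_diff + rhythm_weight * rhythm_diff
            skip_query = cost[i - 1][j] + gap_cost      # extra hummed note
            skip_reference = cost[i][j - 1] + gap_cost  # note the user left out
            cost[i][j] = min(substitute, skip_query, skip_reference)
    return cost[n][m]


def rank_songs(query, database):
    """Return song names sorted from best to worst match."""
    return sorted(database, key=lambda name: match_cost(query, database[name]))


# Two hypothetical songs with identical pitches but different rhythms.
song_a = [(60, 0.5), (62, 0.5), (64, 0.5), (65, 0.5), (67, 1.0)]
song_b = [(60, 1.0), (62, 0.25), (64, 0.25), (65, 1.0), (67, 0.5)]
database = {"song_a": song_a, "song_b": song_b}

# song_a hummed two semitones higher and 20% slower.
query = [(62, 0.6), (64, 0.6), (66, 0.6), (67, 0.6), (69, 1.2)]

if __name__ == "__main__":
    print(rank_songs(query, database))
```

With pitch alone the two hypothetical songs would be indistinguishable, since their interval sequences are identical; the rhythm term is what separates them in the ranking. A real system would additionally need the onset detection and pitch tracking stages that produce the note sequence from audio.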
