Continuous HMM and Its Enhancement for Singing/Humming Query Retrieval

Hidden Markov models (HMMs) have been applied successfully to speech recognition in a variety of applications over the past decades. However, the use of continuous HMMs (CHMMs) for melody recognition via acoustic input (MRAI), also known as query by singing/humming, has seldom been reported, partly because the acoustic characteristics of singing/humming input differ from those of speech. This paper derives the CHMM training formulas for frame-based MRAI. In particular, we propose an enhancement to CHMM and demonstrate that, with this enhancement, CHMM compares favourably with dynamic time warping (DTW) in both efficiency and effectiveness.
