A statistical approach to retrieval under user-dependent uncertainty in query-by-humming systems

Robustly handling uncertainty in query formulation and search is one of the most challenging problems in multimedia information retrieval (MIR) systems. This paper presents a statistical approach to retrieval under uncertainty in Query by Humming (QBH) systems. The audio input is transcribed directly into pitch and duration symbols. From the transcribed data vector, fingerprints that each carry a fixed length of information are extracted around characteristic local points of the hummed melody. Instead of using the humming input as a whole, the search through the database uses these extracted fingerprints. The distance from each fingerprint to the original melodies in the database is calculated and converted to a probabilistic similarity measure. The melodies with the highest similarity measures are returned to the user as the most likely query results. The algorithm is tested on manually annotated data comprising 250 humming samples, in conjunction with a database of 200 pre-processed MIDI files. Retrieval accuracy is 94 percent for samples from subjects with some musical training or background, compared with 72 percent for samples from untrained subjects. The results also show that extracting fingerprints with respect to characteristic local points of the hummed tune is an effective and robust approach to search and retrieval under uncertainty.
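
The retrieval pipeline summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper; several details are assumptions made only for illustration: characteristic local points are taken to be local pitch extrema, a fingerprint is a fixed-length window of pitch intervals (durations are ignored), the distance is an L1 distance minimized over all alignments, and distances are converted to similarities with a normalized exponential. All function names, the `half_width` and `scale` parameters, and the toy melodies are hypothetical.

```python
import math


def characteristic_points(pitches):
    """Indices of local pitch extrema (assumed definition of 'characteristic points')."""
    points = []
    for i in range(1, len(pitches) - 1):
        left = pitches[i] - pitches[i - 1]
        right = pitches[i + 1] - pitches[i]
        if left * right < 0:  # the melodic contour changes direction here
            points.append(i)
    return points


def extract_fingerprints(pitches, half_width=2):
    """Fixed-length windows of pitch intervals centred on each characteristic point."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    prints = []
    for p in characteristic_points(pitches):
        start, end = p - half_width, p + half_width
        if start >= 0 and end <= len(intervals):  # skip windows that run off the ends
            prints.append(tuple(intervals[start:end]))
    return prints


def min_distance(fingerprint, melody_pitches):
    """Smallest L1 distance between a fingerprint and any same-length window of a melody."""
    intervals = [b - a for a, b in zip(melody_pitches, melody_pitches[1:])]
    n = len(fingerprint)
    best = math.inf
    for s in range(len(intervals) - n + 1):
        d = sum(abs(x - y) for x, y in zip(fingerprint, intervals[s:s + n]))
        best = min(best, d)
    return best


def rank_melodies(query_pitches, database, scale=1.0):
    """Return (name, similarity) pairs sorted by decreasing similarity.

    Similarities are exp(-mean distance / scale), normalized to sum to 1.
    This conversion is an assumption, not the measure used in the paper.
    """
    prints = extract_fingerprints(query_pitches)
    scores = {}
    for name, melody in database.items():
        dists = [min_distance(fp, melody) for fp in prints]
        mean_d = sum(dists) / len(dists) if dists else math.inf
        scores[name] = math.exp(-mean_d / scale)
    total = sum(scores.values())
    if total == 0:  # no usable fingerprint, or nothing comparable in the database
        return [(name, 0.0) for name in scores]
    return sorted(((n, s / total) for n, s in scores.items()),
                  key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): MIDI pitch numbers for two stored melodies.
database = {
    "A": [60, 62, 64, 62, 60, 65, 67, 65, 64, 62],
    "B": [60, 67, 65, 72, 70, 64, 62, 69, 67, 60],
}
# Melody A hummed three semitones higher, with one wrong note (71 instead of 70).
query = [63, 65, 67, 65, 63, 68, 71, 68, 67, 65]

ranked = rank_melodies(query, database)
print(ranked[0][0])
```

Working on pitch intervals rather than absolute pitches makes the sketch insensitive to the key in which the user hums, and matching short windows instead of the whole query limits the effect of a single wrong note to the fingerprints that contain it.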
