A practical query-by-humming system for a large music database

This paper describes a music retrieval system that accepts hummed tunes as queries. Because a hummed tune may contain errors, the system uses similarity retrieval. The retrieval result is a list of song names ranked by closeness of match. Our ultimate goal is for the correct song to appear first on the list, so that similarity retrieval eventually returns a single correct answer. The most significant improvement over general query-by-humming systems is that all musical information is processed by beat rather than by note, which makes query processing robust against errors in the hummed input. In addition, acoustic information is transcribed and converted into relative intervals, which are used to build feature vectors. This gives the system higher resolution than other general systems, which use only pitch direction information. The database currently holds over 10,000 songs, and retrieval takes at most one second, a level of performance achieved mainly through the use of indices. We also report the results of music analyses of the songs in the database. Based on these results, we propose new technologies for improving retrieval accuracy, such as partial feature vectors and OR'ed retrieval among multiple search keys. A quantitative evaluation shows that these technologies increase retrieval accuracy by more than 20% compared with the previous system [9]. Practical user interfaces for the system are also described.
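The ideas of relative-interval features and OR'ed retrieval can be illustrated with a minimal sketch. It is not the system's actual implementation: it assumes each melody has already been reduced to one MIDI pitch per beat, uses a city-block distance, and scans the database linearly instead of using indices. All function names, song names, and pitch values are hypothetical.

```python
from typing import Dict, List, Sequence


def relative_intervals(beat_pitches: Sequence[int]) -> List[int]:
    """Semitone difference between consecutive per-beat pitches."""
    return [b - a for a, b in zip(beat_pitches, beat_pitches[1:])]


def pitch_directions(beat_pitches: Sequence[int]) -> List[int]:
    """Coarser feature: only up (+1), down (-1), or same (0)."""
    return [(d > 0) - (d < 0) for d in relative_intervals(beat_pitches)]


def distance(u: Sequence[int], v: Sequence[int]) -> int:
    """City-block distance between two equal-length feature vectors
    (an assumed metric, chosen only for illustration)."""
    return sum(abs(x - y) for x, y in zip(u, v))


def ored_retrieval(keys: Sequence[Sequence[int]],
                   database: Dict[str, Sequence[int]]) -> List[str]:
    """Rank songs by their best (smallest) distance over any search key."""
    scores = {
        name: min(distance(key, vec) for key in keys)
        for name, vec in database.items()
    }
    return sorted(scores, key=scores.get)


# Hypothetical per-beat pitches for two songs that both rise on every beat.
song_a = [60, 62, 64, 65, 67]
song_b = [60, 64, 67, 69, 72]
database = {
    "song_a": relative_intervals(song_a),
    "song_b": relative_intervals(song_b),
}

# A hummed query: song_a transposed down by five semitones.
query = [55, 57, 59, 60, 62]
# Two search keys: the query as transcribed, and a variant with one wrong interval.
keys = [relative_intervals(query), [2, 2, 1, 4]]

ranking = ored_retrieval(keys, database)
```

In this toy data, pitch direction alone cannot separate the two songs because both rise on every beat, whereas relative intervals can. Because intervals are differences, the transposed query still matches `song_a` exactly, and taking the minimum over several keys lets one good key compensate for an erroneous one.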

[1] Brian Christopher Smith, et al. Query by humming: musical information retrieval in an audio database, 1995, MULTIMEDIA '95.

[2] Jonathan Foote, et al. Content-based retrieval of music and audio, 1997, Other Conferences.

[3] Ian H. Witten, et al. The New Zealand Digital Library MELody inDEX, 1997, D Lib Mag.

[4] David De Roure, et al. A tool for content based navigation of music, 1998, MULTIMEDIA '98.

[5] B. S. Manjunath, et al. NeTra: A toolbox for navigating large image databases, 1997, Proceedings of International Conference on Image Processing.

[6] Atsuo Yoshitaka, et al. A Survey on Content-Based Retrieval for Multimedia Databases, 1999, IEEE Trans. Knowl. Data Eng.

[7] Jean-Gabriel Ganascia, et al. Musical content-based retrieval: an overview of the Melodiscov approach and system, 1999, MULTIMEDIA '99.

[8] Masashi Yamamuro, et al. Humming Query System Using Normalized Time Scale, 1999, CODAS.

[9] Jonathan Foote, et al. An overview of audio information retrieval, 1999, Multimedia Systems.

[10] Jonathan Foote, et al. Visualizing music and audio using self-similarity, 1999, MULTIMEDIA '99.

[11] Udi Manber, et al. Fast text searching: allowing errors, 1992, CACM.

[12] Justin Zobel, et al. Melodic matching techniques for large music databases, 1999, MULTIMEDIA '99.

[13] Ronald W. Schafer, et al. Real-time digital hardware pitch detector, 1976.

[14] Naoko Kosugi, et al. Music retrieval by humming-using similarity retrieval over high dimensional feature vector space, 1999, 1999 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 1999). Conference Proceedings (Cat. No.99CH36368).

[15] Masashi Yamamuro, et al. Multiple inverted array structure for similar image retrieval, 1998, Proceedings. IEEE International Conference on Multimedia Computing and Systems (Cat. No.98TB100241).

[16] M. Yamamuro, et al. A comprehensive image similarity retrieval system that utilizes multiple feature vectors in high dimensional space, 1997, Proceedings of ICICS, 1997 International Conference on Information, Communications and Signal Processing. Theme: Trends in Information Systems Engineering and Wireless Multimedia Communications (Cat..

[17] Masashi Yamamuro, et al. Let's search for songs by humming!, 1999, MULTIMEDIA '99.

[18] John R. Smith, et al. Image Classification and Querying Using Composite Region Templates, 1999, Comput. Vis. Image Underst.