Query by Humming of MIDI and Audio Using Locality Sensitive Hashing

This paper proposes a query-by-humming method based on locality sensitive hashing (LSH). The method constructs an index of melodic fragments by extracting pitch vectors from a database of melodies. In retrieval, the method automatically transcribes a sung query into notes and then extracts pitch vectors in the same way as in index construction. For each query pitch vector, the method searches for similar melodic fragments in the database to obtain a list of candidate melodies. LSH makes this search efficient. The candidate melodies are then ranked by their distance to the entire query and returned to the user. In our experiments, the method achieved a mean reciprocal rank of 0.885 for 2797 queries when searching a database of 6030 MIDI melodies. To retrieve audio signals, we apply an automatic melody transcription method to construct the melody database directly from music recordings and report the corresponding retrieval results.
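
The pipeline described above (index pitch vectors, look up candidates with LSH, re-rank against the whole query) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the note-based window of four notes, the mean removal for transposition invariance, the Gaussian-projection hash family (in the spirit of p-stable LSH), the alignment-based whole-query distance, the toy melodies, and every function name and parameter value are hypothetical choices made for this example. The last helper shows how a mean reciprocal rank figure is computed from per-query ranks.

```python
import math
import random
from collections import defaultdict


def pitch_vectors(pitches, window=4, hop=1):
    """Mean-removed sliding windows over a note sequence (MIDI note numbers)."""
    out = []
    for start in range(0, len(pitches) - window + 1, hop):
        w = pitches[start:start + window]
        mean = sum(w) / window
        out.append(tuple(p - mean for p in w))
    return out


class PStableLSH:
    """Hash tables keyed by quantized Gaussian random projections (assumed scheme)."""

    def __init__(self, dim, num_tables=8, num_projections=3, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.tables = []
        for _ in range(num_tables):
            # Each projection is a random direction plus a random offset.
            projections = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(num_projections)
            ]
            self.tables.append((projections, defaultdict(set)))

    def _key(self, projections, vec):
        return tuple(
            math.floor((sum(a * x for a, x in zip(direction, vec)) + offset) / self.width)
            for direction, offset in projections
        )

    def add(self, vec, melody_id):
        for projections, buckets in self.tables:
            buckets[self._key(projections, vec)].add(melody_id)

    def candidates(self, vec):
        # Union of the melodies sharing a bucket with vec in any table.
        found = set()
        for projections, buckets in self.tables:
            found |= buckets.get(self._key(projections, vec), set())
        return found


def whole_query_distance(query, melody):
    """Smallest transposition-invariant L2 distance over all alignments."""
    n = len(query)
    best = math.inf
    for start in range(0, len(melody) - n + 1):
        diffs = [q - m for q, m in zip(query, melody[start:start + n])]
        shift = sum(diffs) / n  # best constant transposition for this alignment
        best = min(best, math.sqrt(sum((d - shift) ** 2 for d in diffs)))
    return best


def search(query, database, index, window=4):
    """Collect LSH candidates per query vector, then rank by whole-query distance."""
    hits = set()
    for vec in pitch_vectors(query, window):
        hits |= index.candidates(vec)
    return sorted(hits, key=lambda mid: whole_query_distance(query, database[mid]))


def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct melody per query, None if not returned."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)


# Toy database (hypothetical melodies, not the paper's data).
database = {
    "scale": [60, 62, 64, 65, 67, 69, 71, 72],
    "twinkle": [60, 60, 67, 67, 69, 69, 67, 65],
    "lamb": [64, 62, 60, 62, 64, 64, 64, 62],
}
index = PStableLSH(dim=4)
for melody_id, notes in database.items():
    for vec in pitch_vectors(notes):
        index.add(vec, melody_id)

query = [63, 63, 70, 70, 72, 72]  # opening of "twinkle", three semitones higher
ranked = search(query, database, index)
print(ranked)
```

The point of the two-stage design is that only melodies sharing a hash bucket with some query fragment reach the more expensive whole-query comparison. A real sung query would first need transcription into notes, and its pitch vectors would match database vectors only approximately, which is why several hash tables are used.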
