Towards intelligent string matching in query-by-humming systems

In the past 5-10 years, there have been several attempts to create a musical database that can be queried acoustically. The main obstacle so far has been scalability: the limits of the widely used three-character representation of melodic contour become apparent even at relatively small database sizes. Extending the contour representation to 24-semitone resolution would go a long way towards solving such problems, but applying current string matching methods to a 24-character alphabet would prove difficult at best, and human errors would be far more prevalent. Here we show that the vast majority of errors at this resolution, although more numerous, are quite predictable using current methods of probabilistic modeling. We hope this evidence opens the door to more intelligent forms of note matching.
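
To make the scalability argument concrete, the following minimal sketch contrasts a three-character contour string with a semitone-resolution interval sequence. It is an illustration only, not the encoding used in this work: the function names, the use of MIDI note numbers as input, and the clipping of intervals to plus or minus 12 semitones are all assumptions, and the exact 24-symbol alphabet is not specified in this abstract.

```python
def contour_3(pitches):
    """Three-character contour: U (up), D (down), S (same) for each note pair.

    `pitches` is assumed to be a list of MIDI note numbers (hypothetical input format).
    """
    symbols = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            symbols.append("U")
        elif cur < prev:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def semitone_intervals(pitches, max_leap=12):
    """Signed semitone interval for each note pair, clipped to +/- max_leap.

    The clipping range is an assumption made for this illustration.
    """
    return [
        max(-max_leap, min(max_leap, cur - prev))
        for prev, cur in zip(pitches, pitches[1:])
    ]


# Two different melodic fragments (assumed MIDI note numbers).
melody_a = [60, 62, 64, 62]
melody_b = [60, 67, 72, 65]

# Both collapse to the same three-character contour string...
same_contour = contour_3(melody_a) == contour_3(melody_b)

# ...but remain distinguishable at semitone resolution.
different_intervals = semitone_intervals(melody_a) != semitone_intervals(melody_b)
```

Here the two fragments share one contour string, so a contour-only index cannot tell them apart, while their semitone interval sequences differ. The finer alphabet separates more melodies but also exposes more of a singer's pitch errors, which is the trade-off described above.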
