CubyHum: a fully operational "query by humming" system

'Query by humming' is an interaction concept in which the identity of a song has to be revealed quickly and in an orderly fashion from sung input, using a large database of known melodies. In short, the system detects the pitches in a sung melody and compares them with symbolic representations of the known melodies. Melodies that are similar to the sung pitches are retrieved. Approximate pattern matching, based on classical dynamic programming, compensates for errors in the sung melody during the comparison. A filtering method reduces the computation required within the dynamic programming framework. This paper presents the algorithms for pitch detection, note onset detection, quantization, melody encoding and approximate pattern matching as they have been implemented in the CubyHum software system. Since human reproduction of melodies is imperfect, findings from an experimental singing study were a crucial input to the development of the algorithms. Future research should pay special attention to the reliable detection of note onsets in any preferred singing style. In addition, research on index methods and fast bit-parallel algorithms for approximate pattern matching needs to be pursued further to reduce the computational requirements of large melody databases.
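The matching step described above can be illustrated with a minimal sketch. It is not the CubyHum implementation: it assumes melodies are already available as lists of MIDI note numbers, encodes them as semitone intervals so that transposition does not matter, and uses unit edit costs with a free starting position in the stored melody (the textbook dynamic programming formulation of approximate substring matching). The function names `intervals`, `best_match_distance` and `rank`, and the two-song database, are hypothetical and serve only as illustration.

```python
def intervals(pitches):
    """Encode a pitch sequence as successive semitone intervals.

    Assumption: pitches are MIDI note numbers. Interval encoding makes
    the comparison independent of the key the user happens to sing in.
    """
    return [b - a for a, b in zip(pitches, pitches[1:])]


def best_match_distance(query, melody):
    """Smallest edit distance between query and any substring of melody.

    Classical dynamic programming with unit costs for substitution,
    insertion and deletion. Row 0 is all zeros, so a match may start
    at any position in the melody.
    """
    prev = [0] * (len(melody) + 1)
    for i, q in enumerate(query, 1):
        curr = [i] + [0] * len(melody)
        for j, m in enumerate(melody, 1):
            cost = 0 if q == m else 1
            curr[j] = min(prev[j - 1] + cost,  # match or substitution
                          prev[j] + 1,         # query symbol left unmatched
                          curr[j - 1] + 1)     # melody symbol skipped
        prev = curr
    # The match may also end anywhere in the melody.
    return min(prev)


def rank(sung_pitches, database):
    """Return (distance, name) pairs sorted from best to worst match."""
    q = intervals(sung_pitches)
    return sorted((best_match_distance(q, intervals(p)), name)
                  for name, p in database.items())


# Hypothetical two-song database of MIDI pitches.
database = {
    "ode": [64, 64, 65, 67, 67, 65, 64, 62, 60, 60, 62, 64, 64, 62, 62],
    "twinkle": [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60],
}

# Sung two semitones higher than stored, with one wrong note (70).
sung = [66, 66, 67, 70, 69, 67, 66, 64]
print(rank(sung, database))
```

This sketch costs time proportional to the query length times the melody length for every database entry, which is the expense that the filtering, indexing and bit-parallel methods mentioned in the abstract aim to reduce.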
