An interface for melody input

We present a software system, called Tuneserver, which recognizes a musical tune whistled by the user, finds it in a database, and returns its name, composer, and other information. Such a service is useful for track retrieval at radio stations, music stores, and similar settings, and it is also a step toward the long-term goal of communicating with a computer much as one would with a human being. Tuneserver is implemented as a public Java-based WWW service with a database of approximately 10,000 motifs. Tune recognition is based on a highly error-resistant encoding, proposed by Parsons, that uses only the direction of the melody (whether each note is higher than, lower than, or the same as the preceding one) and ignores the size of intervals as well as rhythm. We present the design and implementation of the tune recognition core, outline the design of the Web service, and describe the results of an empirical evaluation of the new interface, including the derivation of suitable system parameters, the resulting performance figures, and an error analysis.
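The direction-only encoding can be illustrated with a minimal sketch. It is not Tuneserver's implementation (the abstract does not give one); it assumes the whistled input has already been converted to a sequence of pitches in semitones (MIDI note numbers here), and it uses plain edit distance for the lookup, which is an assumption made for illustration. The names `parsons_code`, `edit_distance`, `best_matches`, and the `tolerance` parameter are hypothetical.

```python
def parsons_code(pitches, tolerance=0.5):
    """Encode a pitch sequence as directions: U (up), D (down), R (repeat).

    `pitches` are assumed to be in semitones (e.g. MIDI note numbers).
    `tolerance` is a hypothetical threshold below which two notes count
    as the same pitch, to absorb small whistling inaccuracies.
    """
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        diff = cur - prev
        if diff > tolerance:
            code.append("U")
        elif diff < -tolerance:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)


def edit_distance(a, b):
    """Levenshtein distance between two direction strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def best_matches(query_code, database, limit=3):
    """Return up to `limit` titles ranked by edit distance to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: edit_distance(query_code, item[1]))
    return [title for title, _ in ranked[:limit]]


# Toy database of two well-known openings (illustrative data only).
database = {
    "Ode to Joy": "RUURDDDDRUURDR",
    "Twinkle, Twinkle, Little Star": "RURURDDRDRDRD",
}

# A query with one note dropped still ranks the intended tune first.
print(best_matches("RUURDDDRUURDR", database, limit=1))
```

Because only the direction of each step is kept, a user who whistles out of tune or in the wrong rhythm can still produce the right code as long as each note moves the right way, and the approximate matching step tolerates a few missing or extra notes.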
