Drum Loops Retrieval from Spoken Queries

Recent efforts in audio indexing and music information retrieval have mostly focused on melody. While this is appropriate for polyphonic music signals, specific approaches are needed for systems dealing with percussive audio signals such as those produced by drums, tabla, or djembé. In this article, we present a complete system for managing a database of drum patterns (or drum loops). Queries to this database are formulated with spoken onomatopoeias, short meaningless words that imitate the different sounds of the drum kit. The transcription task needed to index the database is performed using Hidden Markov Models (HMM) and Support Vector Machines (SVM) and achieves an 86.4% correct recognition rate. The syllables of spoken queries are recognized, and a suitable statistical model then allows the query to be compared and aligned with the rhythmic sequences stored in the database, in order to return a set of the most relevant drum loops.
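To make the retrieval step concrete, the following is a minimal sketch of how a recognized syllable sequence could be matched against transcribed drum loops. It is not the statistical model described in this article: plain edit distance is swapped in as a simple stand-in for the comparison and alignment step. The syllables (`boum`, `ta`, `tss`), the drum labels (`bd`, `sd`, `hh`), the loop names, and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from recognized onomatopoeias to drum-event labels
# (bass drum, snare drum, hi-hat). These are illustrative assumptions.
SYLLABLE_TO_DRUM: Dict[str, str] = {"boum": "bd", "ta": "sd", "tss": "hh"}


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two label sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def rank_loops(
    query_syllables: List[str],
    database: Dict[str, List[str]],
    top_k: int = 3,
) -> List[Tuple[str, int]]:
    """Return the top_k loops closest to the query, best match first."""
    query = [SYLLABLE_TO_DRUM[s] for s in query_syllables]
    scored = sorted(
        (edit_distance(query, sequence), name)
        for name, sequence in database.items()
    )
    return [(name, distance) for distance, name in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical transcriptions of three stored drum loops.
    loops = {
        "rock": ["bd", "hh", "sd", "hh"],
        "disco": ["bd", "hh", "bd", "hh"],
        "fill": ["sd", "sd", "sd", "sd"],
    }
    print(rank_loops(["boum", "tss", "ta", "tss"], loops, top_k=2))
```

A probabilistic alignment would replace the unit costs with probabilities that reflect recognition errors and timing, but the overall shape stays the same: map the query to drum labels, score each stored sequence, and return the best-scoring loops.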
