AdaMast: A Drum Sound Recognizer based on Adaptation and Matching of Spectrogram Templates

This paper describes AdaMast, a template-matching-based system that detects the onset times of the bass drum, snare drum, and hi-hat cymbals in polyphonic audio signals of popular songs. AdaMast uses power spectrograms of the drum sounds as templates. Transcribing drum sounds in the presence of other sounds poses two main problems. First, the actual drum-sound spectrograms cannot be prepared as templates in advance for each song. Second, the power spectrogram of a sound mixture that includes a drum sound differs greatly from the template (a pure drum-sound spectrogram). To solve the first problem, a template-adaptation algorithm is built into AdaMast. To solve the second, the distance measure used in template matching is designed to be robust to spectral overlap from other sounds. In the Audio Drum Detection Contest, AdaMast scored 72.8%, 70.2%, and 57.4% in transcribing the bass drums, snare drums, and hi-hat cymbals, respectively, and won the contest.
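
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the AdaMast implementation: the abstract does not specify the distance measure or the adaptation rule, so everything below is an assumption made for illustration. The sketch assumes that overlapping sounds mainly add power, so the distance penalizes only bins where the mixture has less power than the template, and it assumes that a template can be adapted by taking the per-bin median over the segments matched so far. The function names (`overlap_robust_distance`, `match_onsets`, `adapt_template`) and the threshold are hypothetical.

```python
from statistics import median

# Illustrative sketch only. A spectrogram is a list of frames, and each
# frame is a list of power values, one per frequency bin.

def overlap_robust_distance(template, segment):
    """Sum of squared shortfalls of the segment below the template.

    Assumption: other instruments add power to the mixture, so power in
    excess of the template is ignored and only missing power is penalized.
    """
    total = 0.0
    for t_frame, s_frame in zip(template, segment):
        for t_pow, s_pow in zip(t_frame, s_frame):
            shortfall = max(0.0, t_pow - s_pow)
            total += shortfall * shortfall
    return total

def match_onsets(spectrogram, template, threshold):
    """Return frame indices where the template fits within the threshold."""
    n = len(template)
    onsets = []
    for start in range(len(spectrogram) - n + 1):
        segment = spectrogram[start:start + n]
        if overlap_robust_distance(template, segment) <= threshold:
            onsets.append(start)
    return onsets

def adapt_template(spectrogram, template, onsets):
    """Replace each template bin by the median over the matched segments.

    Assumption: the median suppresses power that other sounds contribute
    to only some of the matched segments.
    """
    if not onsets:
        return template
    adapted = []
    for k in range(len(template)):
        frame = [
            median(spectrogram[s + k][b] for s in onsets)
            for b in range(len(template[k]))
        ]
        adapted.append(frame)
    return adapted

if __name__ == "__main__":
    # Toy data: a 2-frame, 2-bin "seed" template and a 7-frame mixture.
    seed = [[4.0, 0.0], [2.0, 0.0]]
    mixture = [
        [0.0, 1.0], [4.0, 3.0], [2.0, 5.0], [0.0, 0.0],
        [5.0, 0.0], [2.0, 1.0], [0.0, 0.0],
    ]
    onsets = match_onsets(mixture, seed, threshold=1.0)
    print(onsets)
    print(adapt_template(mixture, seed, onsets))
```

In a real system the two steps would presumably be iterated, with the adapted template used for a further matching pass, and neighboring detections would need peak picking; both are omitted here.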
