Drum transcription using partially fixed non-negative matrix factorization

This paper presents a drum transcription algorithm based on partially fixed non-negative matrix factorization. The proposed method lets users identify percussive events in complex mixtures with a minimal training set. The algorithm decomposes the music signal into two parts: a percussive part with pre-defined drum templates and a harmonic part with undefined entries. Because the harmonic part adapts to the music content, the algorithm can work on polyphonic mixtures. Drum event times are then picked from the percussive activation matrix with onset detection. The system is efficient and remains robust even when the training set is minimal. On the ENST dataset, recognition rates range from 56.7% to 78.9% for three percussive instruments extracted from polyphonic music.
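
The decomposition described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the Euclidean multiplicative updates of Lee and Seung, and the function names `partially_fixed_nmf` and `pick_onsets`, the parameters `n_harmonic`, `n_iter`, and `threshold`, and the simple local-maximum peak picker are hypothetical choices made for illustration. The drum templates `W_drum` stay fixed, while the harmonic templates and all activations are updated.

```python
import numpy as np

def partially_fixed_nmf(V, W_drum, n_harmonic=5, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~ [W_drum | W_harm] @ [H_drum; H_harm] with W_drum held fixed.

    V      : non-negative magnitude spectrogram, shape (n_bins, n_frames)
    W_drum : pre-defined drum templates, shape (n_bins, n_drum), never modified
    Assumption: Euclidean cost with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    n_drum = W_drum.shape[1]

    # Harmonic templates and all activations start from random positive values.
    W_harm = rng.random((n_bins, n_harmonic)) + eps
    H = rng.random((n_drum + n_harmonic, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_drum, W_harm])
        # Update every activation row (drum and harmonic).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the harmonic templates; the drum columns are left untouched.
        H_harm = H[n_drum:]
        W_harm *= (V @ H_harm.T) / ((W @ H) @ H_harm.T + eps)

    return H[:n_drum].copy(), W_harm, H[n_drum:].copy()

def pick_onsets(activation, threshold):
    """Return frame indices of local maxima above a threshold (a simplified
    stand-in for a full onset detector)."""
    a = np.asarray(activation, dtype=float)
    is_peak = (a[1:-1] > a[:-2]) & (a[1:-1] >= a[2:]) & (a[1:-1] > threshold)
    return np.flatnonzero(is_peak) + 1
```

In this sketch, each row of the returned drum activation matrix corresponds to one drum template, so calling `pick_onsets` on a row yields candidate event frames for that instrument. The threshold would have to be tuned to the data.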
