Supervised and Unsupervised Sequence Modelling for Drum Transcription

We discuss in this paper two post-processings for drum transcription systems, which aim to model typical properties of drum sequences. Both methods operate on a symbolic representation of the sequence, which is obtained by quantizing the onsets of drum strokes on an optimal tatum grid, and by fusing the posterior probabilities produced by the drum transcription system. The first proposed method is a generalization of the N -gram model. We discuss several training and recognition strategies (style-dependent models, local models) in order to maximize the reliability and the specificity of the trained models. Alternatively, we introduce a novel unsupervised algorithm based on a complexity criterion, which finds the most regular and wellstructured sequence compatible with the acoustic scores produced by the transcription system. Both approaches are evaluated on a subset of the ENST-drums corpus, and yield performance improvements.

[1]  Gerhard Widmer,et al.  Music complexity measures predicting the listening experience , 2006 .

[2]  Ronald de Wolf,et al.  Algorithmic Clustering of Music Based on String Compression , 2004, Computer Music Journal.

[3]  Jouni Paulus,et al.  Unpitched Percussion Transcription , 2006 .

[4]  Abraham Lempel,et al.  Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.

[5]  Ian H. Witten,et al.  The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression , 1991, IEEE Trans. Inf. Theory.

[6]  Masataka Goto,et al.  An Error Correction Framework Based on Drum Pattern Periodicity for Improving Drum Sound Detection , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[7]  Craig G. Nevill-Manning,et al.  Compression by induction of hierarchical grammars , 1994, Proceedings of IEEE Data Compression Conference (DCC'94).

[8]  Gaël Richard,et al.  Automatic Labelling of Tabla Signals , 2003 .

[9]  Gaël Richard,et al.  Automatic transcription of drum loops , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  Anssi Klapuri,et al.  Conventional and periodic N-grams in the transcription of drum sequences , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[11]  Anssi Klapuri,et al.  Signal Processing Methods for the Automatic Transcription of Music , 2004 .

[12]  Gaël Richard,et al.  ENST-Drums: an extensive audio-visual database for drum signals processing , 2006, ISMIR.

[13]  M. Li,et al.  Melody Classification using a Similarity Metric based on Kolmogorov Complexity , 2004 .

[14]  Christian Uhle,et al.  ESTIMATION OF TEMPO, MICRO TIME AND TIME SIGNATURE FROM PERCUSSIVE MUSIC , 2003 .