Template-based estimation of tempo: using unsupervised or supervised learning to create better spectral templates

In this paper, we study tempo estimation using spectral templates obtained by unsupervised or supervised learning from a tempo-annotated database. More precisely, we study the inclusion of these templates in our tempo estimation algorithm of [1]. As the periodicity observation, we consider a 48-dimensional vector obtained by sampling the amplitude of the DFT at tempo-related frequencies; we call this the spectral template. A set of reference spectral templates is then learned, in an unsupervised or supervised way, from an annotated database. These reference spectral templates, combined with all possible tempo hypotheses, constitute the hidden states, which we decode using a Viterbi algorithm. Experiments performed on the "ballroom dancer" test set allow us to conclude that the method improves over the state of the art. In particular, we discuss the use of prior tempo probabilities. It should be noted, however, that these results are only indicative, since the training set and the test set are identical in this preliminary experiment.
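
As a rough illustration of the pipeline described above (not the authors' exact implementation), the Python sketch below computes a spectral-template-style observation vector by sampling the DFT magnitude of an onset-energy envelope at tempo-related frequencies, and decodes a sequence of hidden states with a standard Viterbi recursion. The frequency grid (quarter-multiples of the beat frequency), the unit-norm scaling, and all function and variable names are assumptions of this sketch; the paper's actual frequency sampling, reference-template training, and transition model are not specified here.

import numpy as np

def spectral_template(onset_env, sr_env, tempo_bpm, n_dims=48):
    # Sample the DFT magnitude of an onset-energy envelope at frequencies
    # related to a candidate tempo (here: quarter-multiples of the beat
    # frequency, an assumption of this sketch), giving an n_dims vector.
    spectrum = np.abs(np.fft.rfft(onset_env))
    freqs = np.fft.rfftfreq(len(onset_env), d=1.0 / sr_env)
    beat_hz = tempo_bpm / 60.0
    target_hz = beat_hz * np.arange(1, n_dims + 1) / 4.0
    idx = np.searchsorted(freqs, target_hz).clip(0, len(freqs) - 1)
    vec = spectrum[idx]
    return vec / (np.linalg.norm(vec) + 1e-12)   # unit-norm for cosine-style scoring

def viterbi(log_obs, log_trans, log_init):
    # Standard Viterbi decoding over S hidden states and T frames.
    # log_obs: (T, S) observation log-scores, log_trans: (S, S), log_init: (S,).
    T, S = log_obs.shape
    delta = log_init + log_obs[0]                # best score ending in each state
    psi = np.zeros((T, S), dtype=int)            # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans      # (previous state, current state)
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):               # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path                                  # one state index per frame

In such a setup, each hidden state would pair one tempo hypothesis with one learned reference template; a frame's observation log-score could be, for instance, the dot product (cosine similarity) between its tempo-sampled vector and the state's reference template, while log_trans and log_init could encode tempo continuity and the prior tempo probabilities discussed in the abstract.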

[1] Masataka Goto et al., An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds, 2001.

[2] Jean Laroche et al., Efficient Tempo and Beat Tracking in Audio Recordings, 2003.

[3] Geoffroy Peeters, "Copy and Scale" Method for Doing Time-Localized M.I.R. Estimation: Application to Beat-Tracking, 2010, MML '10.

[4] Anssi Klapuri et al., Measuring the Similarity of Rhythmic Patterns, 2002, ISMIR.

[5] George Tzanetakis et al., An Experimental Comparison of Audio Tempo Induction Algorithms, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[6] Pedro Cano et al., Pulse-Dependent Analyses of Percussive Music, 2002, Proc. IEEE ICASSP.

[7] Anssi Klapuri et al., Music Tempo Estimation With k-NN Regression, 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[8] Anssi Klapuri et al., Sound Onset Detection by Applying Psychoacoustic Knowledge, 1999, Proc. IEEE ICASSP.

[9] Simon Dixon et al., Automatic Extraction of Tempo and Beat From Expressive Performances, 2001.

[10] Geoffroy Peeters, Template-Based Estimation of Time-Varying Tempo, 2007, EURASIP Journal on Advances in Signal Processing.

[11] Jaakko Astola et al., Analysis of the Meter of Acoustic Musical Signals, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[12] Eric D. Scheirer et al., Tempo and Beat Analysis of Acoustic Musical Signals, 1998, The Journal of the Acoustical Society of America.

[13] Simon Dixon et al., A Review of Automatic Rhythm Description Systems, 2005, Computer Music Journal.

[14] Geoffroy Peeters, Spectral and Temporal Periodicity Representations of Rhythm for the Automatic Classification of Music Audio Signal, 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[15] George Tzanetakis et al., Analyzing Afro-Cuban Rhythms Using Rotation-Aware Clave Template Matching with Dynamic Programming, 2008, ISMIR.