Towards Multi-Purpose Spectral Rhythm Features: An Application to Dance Style, Meter and Tempo Estimation

This paper addresses the extraction of multipurpose spectral rhythm features that serve several rhythm analysis tasks at once, namely dance style classification, meter estimation, and tempo estimation. The term "spectral rhythm features" reflects the origin of the features: the periodicity function (PF), a spectral representation that encapsulates the salience of the rhythm frequencies. Two dimensionality reduction techniques applied to the PF to extract expressive and compact features are compared: a linear transformation resulting from Principal Component Analysis and a nonlinear mapping derived from a Restricted Boltzmann Machine. The derived features are then used as input to an SVM classifier for each task. In addition, a method is proposed that reformulates the well-studied tempo estimation task as a combination of multiple binary classification sub-problems. Evaluation on a large number of datasets demonstrates that the same set of features learned from the PF provides a robust rhythmic representation that achieves results comparable to the current state-of-the-art methods for the aforementioned tasks.
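To make the idea of a periodicity function concrete, the following is a minimal sketch, not the paper's implementation. It assumes the PF can be approximated by the magnitude of the Fourier component of an onset-strength envelope at each candidate tempo. The function name `periodicity_function`, the 40 to 200 BPM grid, the frame rate, and the synthetic envelope are all illustrative assumptions.

```python
import math

def periodicity_function(envelope, frame_rate, bpm_grid):
    """Hypothetical PF: salience of each candidate tempo, taken as the
    magnitude of the envelope's Fourier component at that rhythm frequency."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]  # remove the DC offset
    saliences = []
    for bpm in bpm_grid:
        # Angular frequency of this tempo, in radians per envelope frame.
        omega = 2.0 * math.pi * (bpm / 60.0) / frame_rate
        re = sum(x * math.cos(omega * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(omega * t) for t, x in enumerate(centered))
        saliences.append(math.hypot(re, im) / n)
    return saliences

# Synthetic onset envelope (assumption): one impulse every 0.5 s,
# i.e. 120 BPM, sampled at 100 frames per second for 10 s.
frame_rate = 100
envelope = [1.0 if t % 50 == 0 else 0.0 for t in range(1000)]

bpm_grid = list(range(40, 201))  # candidate tempi, 40..200 BPM
pf = periodicity_function(envelope, frame_rate, bpm_grid)
dominant_bpm = bpm_grid[pf.index(max(pf))]
print(dominant_bpm)
```

In a pipeline like the one described above, a vector such as `pf` would be the input to the dimensionality reduction step (PCA or an RBM) before classification; those stages are omitted here.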
