Predicting Agreement and Disagreement in the Perception of Tempo

In the absence of a music score, tempo can only be defined by how users perceive it. Recent studies have therefore focused on estimating perceptual tempo as defined by listening experiments. So far, algorithms have been proposed only to estimate the tempo when people agree on it. In this paper, we study the case where people disagree on the perception of tempo, and we propose an algorithm to predict this disagreement. To do so, we hypothesize that the perception of tempo is correlated with the variations of several viewpoints on the audio content: energy, harmony, spectral balance, and short-term similarity rate. We suppose that when those variations are coherent, a shared perception of tempo is favoured, and that when they are not, people may perceive different tempi. We then propose several statistical models to predict agreement or disagreement in the perception of tempo from these audio features. Finally, we evaluate the models using a test set resulting from the perceptual experiment performed at Last-FM in 2011.
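The coherence hypothesis can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes each viewpoint (energy, harmony, spectral balance, short-term similarity) has already been reduced to a one-dimensional variation curve sampled at a common frame rate, and it scores coherence as the mean pairwise correlation between the autocorrelation profiles of those curves. The function names `periodicity_profile` and `coherence_score` are hypothetical, as is the choice of autocorrelation and Pearson correlation as the measure.

```python
import numpy as np


def periodicity_profile(variation_curve):
    """Normalised autocorrelation of a mean-removed curve, lag 0 dropped.

    Assumption: periodicities in the curve stand in for tempo candidates.
    """
    x = np.asarray(variation_curve, dtype=float)
    x = x - x.mean()
    # Keep only non-negative lags of the full autocorrelation.
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    if acf[0] <= 0:
        # Constant curve: no periodicity information at all.
        return np.zeros(len(x) - 1)
    return acf[1:] / acf[0]


def coherence_score(variation_curves):
    """Mean pairwise Pearson correlation between periodicity profiles.

    Assumption: a high score means the viewpoints suggest the same
    periodicities; a low score means they suggest different ones.
    """
    profiles = np.array([periodicity_profile(c) for c in variation_curves])
    corr = np.corrcoef(profiles)
    upper = np.triu_indices(len(profiles), k=1)
    return float(corr[upper].mean())


def pulse_train(period, offset=0, length=240):
    """Synthetic variation curve: one impulse every `period` frames."""
    curve = np.zeros(length)
    curve[offset::period] = 1.0
    return curve


if __name__ == "__main__":
    # Four viewpoints pulsing at the same period, with different phases.
    same_period = [pulse_train(12, offset=o) for o in (0, 3, 5, 7)]
    # Four viewpoints pulsing at unrelated periods.
    mixed_periods = [pulse_train(p) for p in (7, 11, 13, 17)]
    print("same period:   ", coherence_score(same_period))
    print("mixed periods: ", coherence_score(mixed_periods))
```

In this toy setting, a score like this could serve as one input feature to a statistical model that predicts agreement or disagreement. The features and models actually used are those described in the paper itself.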
