Addressing Tempo Estimation Octave Errors in Electronic Music by Incorporating Style Information Extracted from Wikipedia

A frequently occurring problem with state-of-the-art tempo estimation algorithms is that the predicted tempo for a piece of music is a whole-number multiple or fraction of the tempo perceived by humans (a tempo octave error). While this is often simply caused by shortcomings of the algorithms used, in certain cases the problem can be attributed to the fact that the actual number of beats per minute (BPM) in a piece is not a listener’s only criterion for considering it “fast” or “slow”. Indeed, it can be argued that the perceived style of the music sets an expectation of tempo and therefore influences its perception. In this paper, we address the issue of tempo octave errors in the context of electronic music styles. We propose to incorporate stylistic information by means of probability density functions that represent tempo expectations for the individual music styles. In combination with a style classifier, these probability density functions are used to choose the most probable BPM estimate for a sample. Our evaluation shows a considerable improvement in tempo estimation accuracy on the test dataset.
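To make the combination concrete, the following is a minimal sketch (not the authors' implementation) of how a style classifier's posterior can re-score tempo-octave candidates under per-style tempo expectation PDFs. All names, the Gaussian form of the PDFs, the per-style mean/standard-deviation values, and the candidate octave factors are illustrative assumptions, not values from the paper.

```python
# Sketch of style-informed octave-error correction (assumptions noted above):
# given a raw BPM estimate and a style posterior, re-score whole-number
# multiples/fractions of the estimate under per-style tempo expectations
# and keep the most probable candidate.

import math

# Hypothetical per-style tempo expectations: (mean BPM, std dev BPM).
STYLE_TEMPO_PDFS = {
    "drum_and_bass": (174.0, 6.0),
    "house":         (126.0, 8.0),
    "dubstep":       (140.0, 5.0),
}

def gaussian_pdf(x, mean, std):
    """Evaluate a univariate Gaussian density at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def correct_octave(bpm_estimate, style_posterior,
                   factors=(1/3, 1/2, 1.0, 2.0, 3.0)):
    """Return the tempo-octave candidate with the highest density under
    the style-weighted mixture of tempo expectation PDFs.

    bpm_estimate    -- raw tempo from any base estimator
    style_posterior -- dict mapping style name to classifier probability
    """
    best_bpm, best_score = bpm_estimate, -1.0
    for f in factors:
        candidate = bpm_estimate * f
        # Mixture density: sum over styles of P(style) * p(tempo | style).
        score = sum(p * gaussian_pdf(candidate, *STYLE_TEMPO_PDFS[s])
                    for s, p in style_posterior.items())
        if score > best_score:
            best_bpm, best_score = candidate, score
    return best_bpm

if __name__ == "__main__":
    # A drum & bass track whose base estimator halved the tempo to 87 BPM:
    posterior = {"drum_and_bass": 0.85, "house": 0.10, "dubstep": 0.05}
    print(correct_octave(87.0, posterior))  # -> 174.0
```

In this sketch the correction is a pure post-processing step, so it can wrap any base tempo estimator; only the per-style PDFs and the classifier posterior need to be supplied.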
