Estimation of the reliability of multiple rhythm features extraction from a single descriptor

The design of systems for automatic audio feature extraction is a central aspect of Music Information Retrieval. However, feature extraction systems often provide no indication of how reliable the extracted feature is, even though a reliability or confidence measure can be critical when the feature is used in complex systems and real-world applications. In the present study we investigate the relationship between the entropy of a rhythmogram, previously proposed as a descriptor of tempo salience, and the reliability of the extraction of multiple high-level rhythm-related features. The results show that this single descriptor is a viable means of simultaneously estimating the extraction reliability of multiple rhythm features. They also provide quantitative support for observations that have been reported extensively, but only qualitatively, in the literature.
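To make the descriptor concrete, the sketch below shows one plausible way to compute a rhythmogram entropy score. It is not the authors' implementation: it substitutes librosa's autocorrelation tempogram for the rhythmogram, and the clipping, per-frame normalization, and averaging choices are illustrative assumptions. Under those assumptions, a score near 0 (one dominant periodicity per frame) would suggest that downstream tempo and beat estimates are likely reliable, while a score near 1 (a flat, ambiguous rhythmogram) would flag them as suspect.

```python
import numpy as np
import librosa


def rhythmogram_entropy(path, hop_length=512, win_length=384):
    """Mean per-frame Shannon entropy of an autocorrelation tempogram.

    A rough stand-in for the tempo-salience descriptor discussed above:
    lower values indicate a more salient, unambiguous pulse.
    """
    y, sr = librosa.load(path)
    onset_env = librosa.onset.onset_strength(y=y, sr=sr,
                                             hop_length=hop_length)
    tgram = librosa.feature.tempogram(onset_envelope=onset_env, sr=sr,
                                      hop_length=hop_length,
                                      win_length=win_length)
    # Treat each column (time frame) as a distribution over tempo lags.
    tgram = np.clip(tgram, 0.0, None)  # autocorrelation values can be negative
    p = tgram / (tgram.sum(axis=0, keepdims=True) + 1e-12)
    frame_entropy = -(p * np.log2(p + 1e-12)).sum(axis=0)
    # Normalize by the maximum attainable entropy so the score lies in [0, 1].
    return float(np.mean(frame_entropy) / np.log2(tgram.shape[0]))
```

In practice such a score would need to be calibrated against ground-truth extraction accuracy (e.g., by choosing a threshold on a labeled corpus) before serving as a confidence gate for the extracted rhythm features.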
