Segmenting music through the joint estimation of keys, chords and structural boundaries

In this paper, we introduce a new approach to music structure segmentation based on the joint estimation of structural segments, keys and chords within a single probabilistic framework. More precisely, the boundaries of a structural segment are determined by detecting key changes and by exploiting the fact that the prior probability of a chord transition depends on its position within a structural segment. In contrast to many recent approaches to structural segmentation, this system does not rely on self-similarity matrices, although it has been designed so that such approaches can be integrated into the framework at a later stage. Even in its current version, which uses only the estimated harmony, the system already produces encouraging results, especially with respect to the precise localization of the boundaries.
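To make the idea of joint decoding concrete, the following is a minimal toy sketch, not the system described in this paper. It assumes that chords have already been estimated and are given as root pitch classes of major triads, that only the 12 major keys exist, and that all probabilities (the tonic bias at a segment start, the I/IV/V bias inside a segment, `p_boundary`) are invented for illustration. The function names `chord_logp` and `segment` are hypothetical. A Viterbi pass over joint states (key, is_segment_start) then yields keys and boundaries together.

```python
import math

N_KEYS = 12  # toy assumption: major keys only, indexed by tonic pitch class


def chord_logp(root, key, at_segment_start):
    """Log prior of a major triad on `root` in major key `key`.

    The prior depends on the position in the segment (illustrative values).
    """
    degree = (root - key) % 12
    if at_segment_start:
        # Assumed: a segment tends to open on the tonic chord.
        p = 0.7 if degree == 0 else 0.3 / 11
    else:
        # Assumed: I, IV and V dominate inside a segment.
        p = 0.3 if degree in (0, 5, 7) else 0.1 / 9
    return math.log(p)


def segment(chords, p_boundary=0.05):
    """Viterbi decoding over joint states (key, is_segment_start)."""
    states = [(k, b) for k in range(N_KEYS) for b in (False, True)]

    def trans(prev, cur):
        (prev_key, _), (cur_key, cur_start) = prev, cur
        if cur_start:
            # A new segment starts here: the key may change freely.
            return math.log(p_boundary / N_KEYS)
        # Inside a segment the key must stay the same.
        return math.log(1 - p_boundary) if cur_key == prev_key else -math.inf

    # The first frame is forced to be a segment start, with a uniform key prior.
    score = {
        s: (math.log(1 / N_KEYS) + chord_logp(chords[0], s[0], True))
        if s[1] else -math.inf
        for s in states
    }
    backpointers = []
    for root in chords[1:]:
        new_score, ptr = {}, {}
        for cur in states:
            best = max(states, key=lambda prev: score[prev] + trans(prev, cur))
            new_score[cur] = (score[best] + trans(best, cur)
                              + chord_logp(root, cur[0], cur[1]))
            ptr[cur] = best
        backpointers.append(ptr)
        score = new_score

    # Backtrack from the best final state.
    state = max(states, key=score.get)
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy progression: eight chords in C major followed by eight in D major.
chords = [0, 5, 7, 0, 0, 5, 7, 0, 2, 7, 9, 2, 2, 7, 9, 2]
path = segment(chords)
boundaries = [t for t, (_, is_start) in enumerate(path) if is_start]
keys = [key for key, _ in path]
```

On this toy input, the decoded path places a segment start at the first chord and at the key change, because a boundary is the only way for the key to move and a tonic chord is favored at a segment start. A real system would work on audio features and learned probabilities rather than on clean chord labels.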
