Music Structure Segmentation Algorithm Evaluation: Expanding on MIREX 2010 Analyses and Datasets

Music audio structure segmentation has been a task in the Music Information Retrieval Evaluation eXchange (MIREX) since 2009. In 2010, five algorithms were evaluated against two datasets (297 and 100 songs) that focused almost exclusively on Western popular music. In 2011, a new annotated dataset became available that is significantly larger and covers a more diverse range of musical styles: it comprises over 1,300 songs spanning pop, jazz, classical, and world music. We re-evaluate the algorithms from the 2010 iteration of MIREX against this new dataset and present a detailed analysis of the results in order to better understand the current state of the art in automatic structure segmentation. The expanded analyses focus on how algorithm performance and rankings interact with datasets, musical styles, and annotation level. Because the new dataset contains multiple annotations for each song, we also introduce a baseline for expected human performance on this task.
