Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity

Automatically inferring the structural properties of raw multimedia documents is essential in today's digitized society. Given their hierarchical and multi-faceted organization, musical pieces represent a challenge for current computational systems. In this article, we present a novel approach to music structure annotation based on the combination of structure features with time series similarity. Structure features encapsulate both local and global properties of a time series, and allow us to detect boundaries between homogeneous, novel, or repeated segments. Time series similarity is then used to identify equivalent segments, which correspond to musically meaningful parts. Extensive tests with a total of five benchmark music collections and seven different human annotations show that the proposed approach is robust to different ground truth choices and parameter settings. Moreover, it outperforms previous approaches evaluated under the same framework.
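
The two stages described above (boundary detection from a structural representation, then grouping of equivalent segments) can be illustrated with a small toy sketch. It is not the article's implementation: the abstract does not specify the algorithm, so every function name, the distance-threshold recurrence rule, the unsmoothed novelty curve, and the mean-frame segment comparison below are simplifying assumptions chosen only to keep the example short and dependency-free.

```python
import math

# Illustrative sketch only; all names and rules here are assumptions,
# not the method or parameters used in the article.

def recurrence_matrix(frames, eps):
    """rec[i][j] = 1 when frames i and j are closer than eps (assumed rule)."""
    n = len(frames)
    return [[1 if math.dist(frames[i], frames[j]) < eps else 0
             for j in range(n)] for i in range(n)]

def to_time_lag(rec):
    """Circularly shift row i so column `lag` compares frame i with frame (i + lag) mod n."""
    n = len(rec)
    return [[rec[i][(i + lag) % n] for lag in range(n)] for i in range(n)]

def novelty_curve(lag_matrix):
    """Euclidean distance between the lag profiles of successive frames."""
    return [math.dist(a, b) for a, b in zip(lag_matrix, lag_matrix[1:])]

def pick_boundaries(novelty, threshold):
    """Return the first frame of each new segment: local novelty peaks above threshold."""
    bounds = []
    for i, v in enumerate(novelty):
        left = novelty[i - 1] if i > 0 else 0.0
        right = novelty[i + 1] if i + 1 < len(novelty) else 0.0
        if v > threshold and v > left and v >= right:
            bounds.append(i + 1)
    return bounds

def label_segments(frames, boundaries, eps):
    """Give two segments the same letter when their mean frames are closer than eps.

    Comparing segment means is a crude stand-in for a time series similarity measure.
    """
    edges = [0, *boundaries, len(frames)]
    prototypes, labels = [], []
    for start, end in zip(edges, edges[1:]):
        segment = frames[start:end]
        mean = [sum(dim) / len(segment) for dim in zip(*segment)]
        for k, proto in enumerate(prototypes):
            if math.dist(mean, proto) < eps:
                labels.append(chr(ord("A") + k))
                break
        else:
            # No existing prototype is close enough: open a new label.
            prototypes.append(mean)
            labels.append(chr(ord("A") + len(prototypes) - 1))
    return labels

# Toy "song" with an A-B-A layout: 20 frames per section, 2-D features.
frames = [(0.0, 0.0)] * 20 + [(1.0, 0.0)] * 20 + [(0.0, 0.0)] * 20

lag = to_time_lag(recurrence_matrix(frames, eps=0.5))
boundaries = pick_boundaries(novelty_curve(lag), threshold=3.0)
labels = label_segments(frames, boundaries, eps=0.5)
print(boundaries, labels)
```

On this noise-free toy input the sketch should place boundaries at the two section changes and assign the first and last sections the same label. Real audio descriptors are noisy, so a practical system would need smoothing of the time-lag representation and a more careful similarity measure than the ones assumed here.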
