Music segment similarity using 2D-Fourier Magnitude Coefficients

Music segmentation is the task of automatically identifying the different segments of a piece. In this work we present a novel approach that clusters musical segments by acoustic similarity using 2D-Fourier Magnitude Coefficients (2D-FMCs). These coefficients, computed from a chroma representation, significantly simplify the clustering problem because they are invariant to key transposition and phase shift. We explore various strategies for obtaining 2D-FMC patches that represent entire segments, and apply k-means to label them. Finally, we discuss possible ways of estimating k and compare our results with the current state of the art, against which they are competitive.
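
The invariance property described above can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic chroma patches, the helper names `fmc2d` and `kmeans`, the farthest-point initialisation, and the use of circular shifts to stand in for key transposition and phase shift are all assumptions made for illustration. The sketch takes the magnitude of the 2D Fourier transform of a 12-bin chroma patch and shows that transposed or time-shifted copies of a segment map to the same feature vector, so a plain k-means groups them together.

```python
import numpy as np

def fmc2d(chroma_patch):
    """Flattened 2D Fourier magnitude of a (12, n_frames) chroma patch."""
    return np.abs(np.fft.fft2(chroma_patch)).ravel()

def kmeans(X, k, n_iter=20):
    """Small k-means with deterministic farthest-point initialisation
    (an illustrative choice, not taken from the paper)."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    # Final assignment against the updated centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Synthetic stand-ins for chroma patches of two different musical sections.
rng = np.random.default_rng(0)
section_a = rng.random((12, 32))
section_b = rng.random((12, 32))

segments = [
    section_a,
    np.roll(section_a, 3, axis=0),  # A transposed up 3 semitones (circular pitch shift)
    section_b,
    np.roll(section_a, 5, axis=1),  # A shifted circularly in time
    np.roll(section_b, 7, axis=0),  # B transposed up 7 semitones
]

features = np.array([fmc2d(s) for s in segments])
labels = kmeans(features, k=2)
print(labels)
```

Because a circular shift along either axis only changes the phase of the 2D Fourier transform, the three variants of section A yield the same magnitudes up to floating-point error and receive the same label. Real segments differ in length and content, which is why a strategy for obtaining one fixed-size patch per segment is still needed.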
