ℓ1-GRAPH BASED MUSIC STRUCTURE ANALYSIS

An unsupervised approach for automatic music structure analysis is proposed, based on the following assumption: if the feature vectors extracted from a specific music segment are drawn from a single subspace, then the sequence of feature vectors extracted from a music recording lies in a union of as many subspaces as there are music segments in the recording. It is well known that each feature vector stemming from a union of independent linear subspaces admits a sparse representation with respect to a dictionary formed by all the other feature vectors, with nonzero coefficients associated only with feature vectors that stem from its own subspace. Such a sparse representation reveals the relationships among the feature vectors and is used to construct a similarity graph, the so-called ℓ1-graph. The segmentation of the audio features is then obtained by applying spectral clustering to the ℓ1-graph. The approach is assessed by conducting experiments on the PopMusic and the UPF Beatles benchmark datasets. Promising results are reported.
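The pipeline described above (sparse self-representation, ℓ1-graph, spectral clustering) can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy "recording" (three segments A, B, A whose frames lie in two orthogonal 2-D subspaces), the ISTA solver for the ℓ1-regularized least-squares problem, the parameter values `lam` and `n_iter`, the small k-means routine, and the function names. The number of clusters `k` is assumed to be known.

```python
import numpy as np

def toy_recording(n=6):
    """Hypothetical stand-in for audio features: segments A, B, A, each with n frames.
    Frames of A lie in span(e0, e1); frames of B lie in span(e2, e3)."""
    def arc(angles, axes):
        F = np.zeros((4, len(angles)))
        F[axes[0]] = np.cos(angles)
        F[axes[1]] = np.sin(angles)
        return F
    a = arc(np.arange(2 * n) * np.pi / (2 * n), (0, 1))  # 2n distinct unit vectors in subspace A
    b = arc(np.arange(n) * np.pi / n, (2, 3))            # n distinct unit vectors in subspace B
    return np.hstack([a[:, :n], b, a[:, n:]])            # columns are frames in temporal order

def sparse_codes(X, lam=0.01, n_iter=500):
    """For every column x_i, approximately solve
    min_c 0.5 * ||x_i - X c||^2 + lam * ||c||_1  subject to c_i = 0,
    using ISTA (proximal gradient). Column i of the result codes x_i."""
    n = X.shape[1]
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        V = C - step * (X.T @ (X @ C - X))   # gradient step on the quadratic term
        C = np.sign(V) * np.maximum(np.abs(V) - lam * step, 0.0)  # soft thresholding
        np.fill_diagonal(C, 0.0)             # a frame may not represent itself
    return C

def l1_graph(C):
    """Symmetric, non-negative affinity matrix built from the sparse coefficients."""
    A = np.abs(C)
    return A + A.T

def spectral_clustering(W, k, n_iter=50):
    """Normalized spectral clustering: embed with the k smallest eigenvectors of the
    symmetric normalized Laplacian, then run a small k-means on the unit-norm rows."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)              # eigenvalues are returned in ascending order
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    centers = [U[0]]                         # deterministic farthest-point initialization
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((U[:, None, :] - centers[None]) ** 2).sum(axis=2), axis=1)
        centers = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

X = toy_recording()
C = sparse_codes(X)
W = l1_graph(C)
labels = spectral_clustering(W, k=2)
print(labels)  # one cluster label per frame, in temporal order
```

In this toy setting the two subspaces are orthogonal, so the coefficients linking frames of A to frames of B stay at zero and the ℓ1-graph splits into one connected component per subspace. Real audio features are noisy and the subspaces are only approximately independent, so the graph would not separate this cleanly; a dedicated ℓ1 solver would also be a more realistic choice than the plain ISTA loop used here.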
