Paper information - Using quadratic programming to estimate feature relevance in structural analyses of music

Using quadratic programming to estimate feature relevance in structural analyses of music

To identify repeated patterns and contrasting sections in music, it is common to use self-similarity matrices (SSMs) to visualize and estimate structure. We introduce a novel application for SSMs derived from audio recordings: using them to learn about the potential reasoning behind a listener's annotation. We use SSMs generated by musically motivated audio features at various timescales to represent contributions to a structural annotation. Since a listener's attention can shift among musical features (e.g., rhythm, timbre, and harmony) throughout a piece, we further break down the SSMs into section-wise components and use quadratic programming (QP) to minimize the distance between a linear sum of these components and the annotated description. We posit that the optimal section-wise weights on the feature components may indicate the features to which a listener attended when annotating a piece, and thus may help us to understand why two listeners disagreed about a piece's structure. We discuss some examples that substantiate the claim that feature relevance varies throughout a piece, use our method to investigate differences between listeners' interpretations, and lastly propose some variations on our method.
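
The fitting step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration: each section-wise component is formed by keeping only the rows of a feature SSM that fall inside one section, the weights are constrained to be non-negative, and the QP is solved with a plain projected gradient loop rather than a dedicated QP solver. The function names, the toy matrices, and the feature labels `timbre` and `harmony` are hypothetical.

```python
def section_components(feature_ssms, sections):
    """Split each feature SSM into one component per section.

    Assumption: a component keeps the rows of the SSM that belong to the
    section and zeroes every other row.
    """
    comps = []
    for name, ssm in feature_ssms.items():
        for s, (start, end) in enumerate(sections):
            comp = [row[:] if start <= i < end else [0.0] * len(row)
                    for i, row in enumerate(ssm)]
            comps.append(((name, s), comp))
    return comps


def inner(a, b):
    """Frobenius inner product of two matrices given as lists of rows."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def fit_weights(annotation, comps, iters=2000):
    """Minimise ||annotation - sum_k w_k * comp_k||^2 subject to w_k >= 0.

    Expanding the squared distance gives a quadratic objective with
    gradient G w - c, where G holds inner products between components
    and c holds inner products between components and the annotation.
    """
    mats = [m for _, m in comps]
    G = [[inner(a, b) for b in mats] for a in mats]
    c = [inner(a, annotation) for a in mats]
    # The largest absolute row sum bounds the largest eigenvalue of G,
    # so its reciprocal is a safe step size.
    step = 1.0 / (max(sum(abs(g) for g in row) for row in G) or 1.0)
    w = [0.0] * len(mats)
    for _ in range(iters):
        grad = [sum(G[i][j] * w[j] for j in range(len(w))) - c[i]
                for i in range(len(w))]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wi - step * gi) for wi, gi in zip(w, grad)]
    return {key: wi for (key, _), wi in zip(comps, w)}


# Toy example: 4 frames, two sections (frames 0-1 and frames 2-3).
annotation = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
feature_ssms = {
    # Hypothetical feature that only explains the first section.
    "timbre": [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    # Hypothetical feature that only explains the second section.
    "harmony": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
}
sections = [(0, 2), (2, 4)]

weights = fit_weights(annotation, section_components(feature_ssms, sections))
for key, value in weights.items():
    print(key, round(value, 3))
```

In this toy case the weight on `timbre` dominates in the first section and the weight on `harmony` dominates in the second, which is the kind of section-wise shift in feature relevance that the abstract describes.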

Jordan B. L. Smith | Elaine Chew
