Using Generic Summarization to Improve Music Information Retrieval Tasks

To satisfy processing time constraints, many music information retrieval (MIR) tasks process only a segment of the whole music signal. This may degrade performance, since the information most relevant to the task may lie outside the processed segment. We leverage generic summarization algorithms, previously applied to text and speech, to summarize items in music datasets. These algorithms build summaries that are both concise and diverse by selecting appropriate segments from the input signal, which also makes them good candidates for summarizing music. We evaluate the summarization process on binary and multiclass music genre classification tasks by comparing the accuracy obtained with summarized datasets against the accuracy obtained with human-oriented summaries, continuous segments (the traditional way of addressing these time constraints), and the full songs of the original dataset. We show that GRASSHOPPER, LexRank, LSA, MMR, and a Support Sets-based centrality model improve classification performance when compared to the selected baselines. We also show that the classification performance obtained with summarized datasets is not statistically significantly different from that obtained with full songs. Furthermore, we argue for the advantages of sharing summarized datasets in future MIR research.
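As an illustration of how a generic summarizer can pick segments that are both relevant and diverse, the following is a minimal sketch of greedy MMR-style selection. It is not the implementation evaluated in this work. It assumes each segment is already represented as a fixed-length feature vector, uses cosine similarity to the centroid of all segments as the relevance measure, and returns the chosen indices in temporal order. The function names (`cosine`, `mmr_select`), the `lam` trade-off parameter, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(segments, k, lam=0.7):
    """Greedily pick up to k segment indices, trading relevance against redundancy.

    Assumption: relevance is similarity to the centroid of all segments;
    redundancy is the highest similarity to any already selected segment.
    """
    dim = len(segments[0])
    centroid = [sum(s[i] for s in segments) / len(segments) for i in range(dim)]
    selected = []
    candidates = list(range(len(segments)))

    def score(i):
        relevance = cosine(segments[i], centroid)
        redundancy = max(
            (cosine(segments[i], segments[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Assumption: the summary keeps the original temporal order of the segments.
    return sorted(selected)


# Toy example: segments 0 and 1 are identical, segment 3 is close to both.
segments = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
summary = mmr_select(segments, k=2, lam=0.5)
```

With `lam=0.5`, the second pick is penalized for resembling the first, so the near-duplicate segments are skipped in favor of the dissimilar one. Raising `lam` toward 1.0 shifts the selection toward relevance alone.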
