Instrumentation-based music similarity using sparse representations

This paper describes a novel method for computing music similarity based on the instrumentation of music pieces. The approach builds on the idea that sparse representations of musical audio signals are a rich source of information about the elements that constitute the observed spectra. We propose a method for extracting feature vectors from sparse representations and use them to calculate a similarity measure between songs. To train a dictionary for sparse representations from a large amount of training data, we propose a novel dictionary-initialization method based on agglomerative clustering. An objective evaluation shows that the new features improve the performance of similarity calculation compared to standard mel-frequency cepstral coefficient (MFCC) features.
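
The sketch below illustrates the general pipeline the abstract describes: encode spectral frames as sparse activations over a learned dictionary, pool the activations into one feature vector per song, and compare songs with a similarity measure. It is a minimal illustration using scikit-learn's generic dictionary learning and cosine similarity, not the paper's actual feature definition or its agglomerative-clustering dictionary initialization; the array shapes and parameters are placeholder assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.metrics.pairwise import cosine_similarity


def song_feature(spectrogram_frames, dictionary_learner):
    """Encode magnitude-spectrogram frames (frames x bins) as sparse
    activations over a learned dictionary, then pool them into a single
    per-song feature vector by averaging the activation magnitudes."""
    activations = dictionary_learner.transform(spectrogram_frames)
    return np.abs(activations).mean(axis=0)


# Learn a dictionary from pooled training frames (rows = spectral frames).
# Real systems would use magnitude spectra of training songs; random data
# is used here only to keep the example self-contained.
rng = np.random.default_rng(0)
train_frames = np.abs(rng.standard_normal((2000, 128)))
dl = DictionaryLearning(n_components=64, transform_algorithm="lasso_lars",
                        transform_alpha=1.0, max_iter=20, random_state=0)
dl.fit(train_frames)

# Similarity between two songs = cosine similarity of their pooled activations.
song_a = np.abs(rng.standard_normal((300, 128)))
song_b = np.abs(rng.standard_normal((250, 128)))
sim = cosine_similarity(song_feature(song_a, dl).reshape(1, -1),
                        song_feature(song_b, dl).reshape(1, -1))[0, 0]
print(f"similarity: {sim:.3f}")
```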
