Music Genre Classification via Joint Sparse Low-Rank Representation of Audio Features

A novel framework for music genre classification, namely the joint sparse low-rank representation (JSLRR), is proposed in order to 1) smooth the noise in the test samples and 2) identify the subspaces in which the test samples lie. An efficient algorithm is proposed for obtaining the JSLRR, and a novel classifier, referred to as the JSLRR-based classifier, is developed. The joint sparse representation-based classifier and the low-rank representation-based classifier are special cases of the JSLRR-based classifier. The performance of these three classifiers is compared against that of the sparse representation-based classifier, the nearest subspace classifier, support vector machines, and the nearest neighbor classifier for music genre classification on six manually annotated benchmark datasets. The best classification results reported here are comparable with or slightly superior to those obtained by state-of-the-art music genre classification methods.
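
To make the subspace idea concrete, the following is a minimal sketch of the nearest subspace classifier, one of the baselines named above. It is not the JSLRR algorithm, whose formulation is not given in this abstract. The function name, the dictionary layout of the training data, and the toy three-dimensional "features" are all assumptions made for illustration: a test sample is assigned to the class whose training samples reconstruct it with the smallest least-squares residual.

```python
import numpy as np


def nearest_subspace_predict(train_by_class, x):
    """Assign x to the class whose training samples span the closest subspace.

    train_by_class: dict mapping a label to a (d, n_c) array whose columns
    are training feature vectors of that class (hypothetical layout).
    x: test feature vector of shape (d,).
    """
    best_label, best_residual = None, np.inf
    for label, A in train_by_class.items():
        # Least-squares coefficients of x on this class's training samples.
        coeffs, *_ = np.linalg.lstsq(A, x, rcond=None)
        # Distance from x to the span of the class's training samples.
        residual = np.linalg.norm(x - A @ coeffs)
        if residual < best_residual:
            best_label, best_residual = label, residual
    return best_label, best_residual


# Toy data (illustrative only): class "genre_a" spans the xy-plane,
# class "genre_b" spans the z-axis.
train = {
    "genre_a": np.array([[1.0, 0.0, 1.0],
                         [0.0, 1.0, 1.0],
                         [0.0, 0.0, 0.0]]),
    "genre_b": np.array([[0.0, 0.0],
                         [0.0, 0.0],
                         [1.0, 2.0]]),
}
x = np.array([0.3, -0.7, 0.05])  # mostly in the xy-plane, small z component
label, residual = nearest_subspace_predict(train, x)
print(label, residual)
```

In this toy case the test vector lies almost entirely in the xy-plane, so the residual against "genre_a" is just its small z component and that class is selected. Real audio features would be high-dimensional, and the per-class fits would typically need regularization.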
