Music Genre Classification Using Spectral Analysis and Sparse Representation of the Signals

In this paper, we propose a robust music genre classification method built on a sparse FFT-based feature extraction method. It combines the discriminating power of spectral analysis of non-stationary audio signals with the capability of sparse representation based classifiers. The feature extraction method combines two sets of features: short-term features (extracted from windowed signals) and long-term features (extracted from combinations of the short-term features). Experimental results demonstrate that the proposed feature extraction method leads to a sparse representation of audio signals and, as a result, to a significant reduction in the dimensionality of the signals. The extracted features are then fed into a sparse representation based classifier (SRC). Our experimental results on the GTZAN database demonstrate that the proposed method outperforms other state-of-the-art SRC approaches. Moreover, the proposed method is computationally more efficient than other Compressive Sampling (CS)-based classifiers.
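
The pipeline described above (short-term spectral features from windowed signals, long-term features aggregated from them, then a sparse representation based classifier) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a plain FFT magnitude spectrum stands in for the sparse FFT-based features, the mean and standard deviation over frames stand in for the long-term aggregation, and a greedy orthogonal matching pursuit stands in for the l1 solver that SRC normally uses. All function names, the frame length, the hop size, and the toy signal generator are hypothetical.

```python
import numpy as np

def short_term_features(signal, frame_len=256, hop=128):
    """Magnitude spectrum of each Hann-windowed frame (one row per frame)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def long_term_features(short_feats):
    """Assumed aggregation: mean and standard deviation over all frames."""
    return np.concatenate([short_feats.mean(axis=0), short_feats.std(axis=0)])

def omp(D, y, n_nonzero):
    """Greedy sparse coding (stand-in for l1 minimisation):
    approximate y using at most n_nonzero columns of D."""
    residual, support = y.copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:  # no new atom improves the fit
            break
        support.append(idx)
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        coef[:] = 0.0
        coef[support] = sol
    return coef

def src_predict(D, labels, y, n_nonzero=5):
    """Assign y to the class whose training columns reconstruct it
    with the smallest residual."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    coef = omp(D, y, n_nonzero)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ coef[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

def toy_clip(freq, rng, sr=8000, n=4096):
    """Hypothetical stand-in for an audio clip: a noisy sine tone."""
    t = np.arange(n) / sr
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(n)
```

To try it, build the dictionary `D` by stacking the long-term feature vector of each training clip as one column, keep the matching class labels in a NumPy array, and pass the feature vector of a test clip to `src_predict`. On real data such as GTZAN, the features and the sparse solver would need to follow the paper rather than these simplified stand-ins.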
