Deep Belief Networks for Automatic Music Genre Classification

This paper proposes an approach to automatic music genre classification using deep belief networks. A deep belief network is constructed from stacked restricted Boltzmann machines and takes as input acoustic features extracted through content-based analysis of music signals. The model parameters are first initialized by training the network with a greedy layer-wise learning algorithm on feature vectors composed of short-term and long-term features. The parameters are then fine-tuned to a local optimum using the backpropagation algorithm. Experiments on the GTZAN dataset show that music genre classification with deep belief networks outperforms widely used classification methods such as support vector machines, K-nearest neighbor, linear discriminant analysis, and neural networks.
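The greedy layer-wise stage described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the architecture, feature scaling, or training settings, so the Bernoulli units, the one-step contrastive divergence (CD-1) update, the layer sizes, the learning rate, and the names `RBM` and `pretrain_dbn` are all assumptions made for illustration. Input features are assumed to be scaled to [0, 1].

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli-Bernoulli restricted Boltzmann machine (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a mini-batch v0 (assumed update rule)."""
        h0 = self.hidden_probs(v0)
        # Sample binary hidden states, then reconstruct the visible layer.
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # Difference between data-driven and reconstruction-driven statistics.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)


def pretrain_dbn(X, layer_sizes, epochs=5, batch_size=10, seed=0):
    """Train one RBM per layer; each layer's hidden activations feed the next."""
    rng = np.random.default_rng(seed)
    rbms = []
    data = X
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden, rng)
        for _ in range(epochs):
            order = rng.permutation(len(data))
            for start in range(0, len(data), batch_size):
                rbm.cd1_step(data[order[start:start + batch_size]])
        rbms.append(rbm)
        # The trained layer is frozen; its outputs become the next layer's input.
        data = rbm.hidden_probs(data)
    return rbms, data


# Toy stand-in for acoustic feature vectors: 40 clips, 12 features in [0, 1].
X = np.random.default_rng(1).random((40, 12))
rbms, top_features = pretrain_dbn(X, layer_sizes=[8, 4])
print(top_features.shape)
```

In the pipeline the abstract describes, the weights learned this way would initialize a feed-forward network with a genre output layer, which is then fine-tuned with backpropagation on labeled examples; that supervised stage is omitted here.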
