Paper Information - A Combination of Hand-Crafted and Hierarchical High-Level Learnt Feature Extraction for Music Genre Classification

In this paper, we propose a new approach to automatic music genre classification that learns a feature hierarchy with a deep learning architecture on top of hand-crafted features extracted from an audio signal. Unlike state-of-the-art approaches, our scheme uses an unsupervised learning algorithm based on Deep Belief Networks (DBNs) trained on block-wise MFCCs (which we treat as 2D images), followed by a supervised learning algorithm that fine-tunes the extracted features. Experiments performed on the GTZAN dataset show that the proposed scheme clearly outperforms state-of-the-art approaches.

Christophe Garcia | Khalid Idrissi | Toru Nakashika | Julien N. P. Martel
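
The unsupervised stage described in the abstract can be illustrated with a minimal NumPy sketch: an MFCC matrix is cut into overlapping 2D blocks, and two restricted Boltzmann machines (RBMs) are stacked and trained greedily with one-step contrastive divergence, which is the usual way a DBN is pre-trained. This is not the authors' implementation. The block length, hop size, layer sizes, learning rate, and the names `blockwise_patches` and `RBM` are all assumptions chosen for illustration, and random numbers stand in for real MFCCs. The supervised fine-tuning stage is not shown.

```python
import numpy as np


def blockwise_patches(mfcc, block_len=32, hop=16):
    """Cut an (n_coeffs, n_frames) MFCC matrix into overlapping 2D blocks.

    block_len and hop are illustrative values, not taken from the paper.
    """
    n_frames = mfcc.shape[1]
    starts = range(0, n_frames - block_len + 1, hop)
    return np.stack([mfcc[:, s:s + block_len] for s in starts])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error


rng = np.random.default_rng(0)

# Stand-in for real MFCCs: 13 coefficients over 400 frames (assumed sizes).
mfcc = rng.standard_normal((13, 400))
blocks = blockwise_patches(mfcc)

# Flatten each 2D block and rescale to [0, 1] for the Bernoulli visible units.
x = blocks.reshape(len(blocks), -1)
x = (x - x.min()) / (x.max() - x.min())

# Greedy layer-wise pre-training: the second RBM learns on the first one's output.
rbm1 = RBM(x.shape[1], 64, rng)
for _ in range(20):
    rbm1.cd1_step(x)
h1 = rbm1.hidden_probs(x)

rbm2 = RBM(64, 32, rng)
for _ in range(20):
    rbm2.cd1_step(h1)
features = rbm2.hidden_probs(h1)  # one learnt feature vector per block
```

In a full pipeline, `features` (or the stacked weights themselves) would initialise a network that is then fine-tuned with genre labels. With real audio, the `mfcc` array would come from a feature extractor rather than a random generator.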
