Music Genre Classification based on Dynamical Models

This paper studies several alternatives for extracting dynamical features from hidden Markov models (HMMs) that are meaningful for supervised music genre classification. Songs are modelled using a three-scale approach. The first stage computes short-term (millisecond) features. Two layers of dynamical models follow: a multivariate autoregressive (AR) model that provides mid-term (second) features for each song, and an HMM stage that captures long-term (song-level) features shared among similar songs. We study empirically which features are relevant for the genre classification task. Experiments on a database including pieces of heavy metal, punk, classical and reggae music illustrate the advantages of each set of features.
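As a rough illustration of the mid-term stage only, the sketch below fits a multivariate AR model to sliding windows of short-term feature frames by ordinary least squares and uses the stacked coefficients and residual variances as one mid-term feature vector per window. The function names (`mar_features`, `midterm_sequence`), the window length, the hop size and the model order are all assumptions made for this example; the abstract does not specify the estimator or these settings.

```python
import numpy as np


def mar_features(frames, order=2):
    """Fit x_t = c + sum_p A_p x_{t-p} + e_t by least squares.

    frames: array of shape (n_frames, n_dims) holding short-term features.
    Returns the flattened intercept and AR coefficients followed by the
    per-dimension residual variances. (Hypothetical helper, not the
    authors' implementation.)
    """
    frames = np.asarray(frames, dtype=float)
    n, d = frames.shape
    if n <= order:
        raise ValueError("need more frames than the AR order")
    # Regressor matrix: a constant column plus the `order` lagged copies.
    lagged = [frames[order - p: n - p] for p in range(1, order + 1)]
    X = np.hstack([np.ones((n - order, 1))] + lagged)
    Y = frames[order:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    residual_var = (Y - X @ coef).var(axis=0)
    return np.concatenate([coef.ravel(), residual_var])


def midterm_sequence(short_term, win=100, hop=50, order=2):
    """Slide a window over the short-term frames, one AR fit per window."""
    short_term = np.asarray(short_term, dtype=float)
    starts = range(0, len(short_term) - win + 1, hop)
    return np.array([mar_features(short_term[s:s + win], order) for s in starts])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_frames = rng.standard_normal((300, 3))  # stand-in for real audio features
    print(midterm_sequence(fake_frames).shape)
```

The resulting sequence of mid-term vectors is the kind of input a long-term HMM stage could be trained on; that stage is omitted here because its configuration is not given in the abstract.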
