A Study on Music Genre Classification Based on Universal Acoustic Models

Musical genre provides a practical measure of similarity between pieces and is often the most informative single descriptor of a piece. Previous approaches that apply hidden Markov models (HMMs) to automatic genre classification have used a single HMM to model an entire song or genre. This paper presents a framework that models music at a finer granularity through acoustic segment modeling. Modeling each acoustic segment with its own HMM builds a timbral dictionary, in the same way that one would build a phonetic dictionary for speech. A symbolic transcription of each piece is then created by finding the most likely sequence of these symbols. The transcriptions serve as input to an efficient text classifier, which solves the genre classification problem. This paper demonstrates that language-ignorant approaches give results consistent with the current state of the art in genre classification. In addition, the finer segmentation potentially allows “musical language”-based syntactic rules to enhance performance.
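
The final stage described above, treating symbolic transcriptions as text and classifying them, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the symbol names (`s1`, `s2`, ...), the genre labels, the use of unigram and bigram counts as features, and the nearest-centroid cosine classifier, which stands in for the paper's text classifier and is not claimed to be the method the authors used. The transcriptions are taken as given; decoding them from audio with HMMs is not shown.

```python
from collections import Counter
import math


def ngram_counts(symbols, n_max=2):
    """Count every n-gram of length 1..n_max in a symbol sequence."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(symbols) - n + 1):
            counts[tuple(symbols[i:i + n])] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labelled):
    """Accumulate n-gram counts per genre from (genre, symbols) pairs."""
    centroids = {}
    for genre, symbols in labelled:
        centroids.setdefault(genre, Counter()).update(ngram_counts(symbols))
    return centroids


def classify(symbols, centroids):
    """Return the genre whose accumulated counts are most similar."""
    vec = ngram_counts(symbols)
    return max(centroids, key=lambda g: cosine(vec, centroids[g]))


# Hypothetical toy transcriptions: each list is one piece, each token
# one acoustic-segment symbol from an assumed timbral dictionary.
train = [
    ("genre_a", ["s1", "s2", "s1", "s2", "s3"]),
    ("genre_a", ["s2", "s1", "s2", "s1"]),
    ("genre_b", ["s4", "s5", "s5", "s4", "s6"]),
    ("genre_b", ["s5", "s4", "s6", "s5"]),
]
centroids = train_centroids(train)
prediction = classify(["s1", "s2", "s1", "s3"], centroids)
print(prediction)
```

Because the features are plain n-gram counts, the sketch uses no musical syntax beyond local symbol co-occurrence, which is one simple reading of a language-ignorant approach.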
