Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness

This paper proposes Temporal Echonest Features, which harness the information available in the beat-aligned vector sequences of the features provided by The Echo Nest. Rather than aggregating these sequences via simple averaging, the statistics of their temporal variations are analyzed and used to represent the audio content. We evaluate performance on four traditional music genre classification test collections and compare the proposed features to state-of-the-art audio descriptors. The experiments reveal that exploiting the temporal variability of beat-aligned vector sequences, and combining different descriptors, improves classification accuracy. Compared against established conventional audio descriptors used as benchmarks, the proposed approaches perform well, often significantly outperforming these baselines, and can be effectively applied to large-scale music genre classification.
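
To illustrate the aggregation idea, the following minimal Python sketch (using numpy and scipy) computes per-dimension temporal statistics over a track's beat-aligned feature sequence. The specific set of statistics shown here (mean, median, variance, min, max, skewness, kurtosis) is assumed for illustration and may differ from the exact configuration used in the paper.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def temporal_statistics(segments):
    """Aggregate a segment-aligned feature sequence (n_segments x n_dims),
    e.g. the 12-dimensional Echo Nest timbre or pitch vectors of a track,
    into a single fixed-length vector of temporal statistics."""
    segments = np.asarray(segments, dtype=float)
    stats = [
        segments.mean(axis=0),        # central tendency per dimension
        np.median(segments, axis=0),
        segments.var(axis=0),         # temporal variability per dimension
        segments.min(axis=0),
        segments.max(axis=0),
        skew(segments, axis=0),       # asymmetry of the temporal distribution
        kurtosis(segments, axis=0),   # peakedness of the temporal distribution
    ]
    return np.concatenate(stats)

# Example: 200 segments of 12-dimensional timbre vectors yield one
# 7 * 12 = 84-dimensional track-level feature vector.
timbre = np.random.rand(200, 12)  # stand-in for data read from the Million Song Dataset
print(temporal_statistics(timbre).shape)  # (84,)
```

The resulting fixed-length vectors can be fed to any standard classifier; unlike plain averaging, they retain information about how the descriptors vary over time.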
