Evaluation of Musical Features for Emotion Classification

Because music conveys and evokes feelings, a wealth of research has been performed on music emotion recognition. Previous research has shown that musical mood is linked to features based on rhythm, timbre, spectrum, and lyrics. For example, sad music correlates with slow tempo, while happy music is generally faster. However, only limited success has been achieved in learning automatic classifiers of emotion in music. In this paper, we collect a ground-truth data set of 2904 songs that have been tagged on the Last.FM website with one of the four words “happy”, “sad”, “angry”, and “relaxed”. An excerpt of the audio for each song is then retrieved from 7Digital.com, and several sets of audio features are extracted using standard algorithms. Two classifiers are trained using support vector machines with the polynomial and radial basis function (RBF) kernels, and both are tested with 10-fold cross-validation. Our results show that spectral features outperform those based on rhythm, dynamics, and, to a lesser extent, harmony. We also find that the polynomial kernel gives better results than the RBF kernel, and that fusing different feature sets does not always improve classification.
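To make the evaluation setup concrete, the following is a minimal sketch in Python of comparing polynomial- and RBF-kernel SVMs under 10-fold cross-validation. It assumes scikit-learn for the classifiers and librosa for a stand-in spectral feature extractor; the abstract does not name the actual toolchain, and the `spectral_features` helper, the placeholder data, and the 30-dimensional feature layout are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code): polynomial vs. RBF kernel SVMs
# evaluated with 10-fold cross-validation, as in the setup described above.
# librosa and the spectral_features helper are illustrative assumptions.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def spectral_features(path):
    """Mean and standard deviation of a few standard spectral descriptors."""
    y, sr = librosa.load(path, mono=True)
    frames = np.vstack([
        librosa.feature.spectral_centroid(y=y, sr=sr),   # brightness
        librosa.feature.spectral_rolloff(y=y, sr=sr),    # spectral shape
        librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13),     # timbre
    ])
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

# Placeholder data standing in for the 2904 labelled excerpts; in practice,
# each row of X would be spectral_features(path) for one retrieved excerpt.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))   # 30 = 15 descriptors x (mean, std)
y = rng.choice(["happy", "sad", "angry", "relaxed"], size=200)

# Compare the two kernels under identical 10-fold cross-validation.
for kernel in ("poly", "rbf"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{kernel}: mean accuracy = {scores.mean():.3f}")
```

Under this framing, the feature-set fusion studied in the paper would amount to concatenating the per-set feature vectors (e.g., spectral plus rhythm descriptors) before scaling, which is why fusion can dilute rather than improve performance when one set is much weaker.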
