Music Emotion Recognition: A State of the Art Review

This paper surveys the state of the art in automatic emotion recognition in music. Music is often referred to as a “language of emotion” [1], and it is natural for us to categorize music in terms of its emotional associations. Myriad features, such as harmony, timbre, interpretation, and lyrics, affect emotion, and the mood of a piece may also change over its duration. In developing automated systems to organize music by emotional content, however, we face a problem that often lacks a well-defined answer: there may be considerable disagreement regarding the perception and interpretation of a song's emotions, or ambiguity within the piece itself. Compared with other music information retrieval tasks (e.g., genre identification), the identification of musical mood is still in its early stages, though it has received increasing attention in recent years. In this paper we explore a wide range of research in music emotion recognition, focusing in particular on methods that use contextual text information (e.g., websites, tags, and lyrics), content-based approaches, and systems that combine multiple feature domains.

[1] Mark B. Sandler, et al. A Semantic Space for Music Derived from Social Tags, 2007, ISMIR.

[2] Youngmoo E. Kim, et al. MoodSwings: A Collaborative Game for Music Mood Label Collection, 2008, ISMIR.

[3] Daniel P. W. Ellis, et al. Support vector machine active learning for music retrieval, 2006, Multimedia Systems.

[4] Richard A. Harshman, et al. Indexing by Latent Semantic Analysis, 1990, J. Am. Soc. Inf. Sci.

[5] Lie Lu, et al. Music type classification by spectral contrast feature, 2002, Proc. IEEE International Conference on Multimedia and Expo.

[6] Björn W. Schuller, et al. Determination of Nonprototypical Valence and Arousal in Popular Music: Features and Performances, 2010, EURASIP J. Audio Speech Music Process.

[7] George Tzanetakis, et al. Musical genre classification of audio signals, 2002, IEEE Trans. Speech Audio Process.

[8] Takuya Fujishima, et al. Realtime Chord Recognition of Musical Sound: A System Using Common Lisp Music, 1999, ICMC.

[9] Gert R. G. Lanckriet, et al. Combining audio content and social context for semantic music discovery, 2009, SIGIR.

[10] Daniel P. W. Ellis, et al. Automatic Record Reviews, 2004, ISMIR.

[11] M. Bangert, et al. Emotion in Motion: Investigating the Time-Course of Emotional Judgments of Musical Stimuli, 2009.

[12] George Tzanetakis, et al. MARSYAS Submissions to MIREX 2007, 2007.

[13] François Pachet, et al. Improving Timbre Similarity: How High's the Sky?, 2004.

[14] Dan Yang, et al. Disambiguating Music Emotion Using Software Agents, 2004, ISMIR.

[15] Youngmoo E. Kim, et al. Prediction of Time-Varying Musical Mood Distributions from Audio, 2010, ISMIR.

[16] Ming Li, et al. THINKIT's Submissions for MIREX 2009 Audio Music Classification and Similarity Tasks, 2009.

[17] I. Peretz, et al. Universal Recognition of Three Basic Emotions in Music, 2009, Current Biology.

[18] Youngmoo E. Kim, et al. Feature selection for content-based, time-varying musical emotion regression, 2010, MIR '10.

[19] Wolfgang Nejdl, et al. Music Mood and Theme Classification: A Hybrid Approach, 2009, ISMIR.

[20] Beth Logan, et al. Mel Frequency Cepstral Coefficients for Music Modeling, 2000, ISMIR.

[21] George Tzanetakis, et al. MARSYAS: A framework for audio analysis, 1999, Organised Sound.

[22] Janto Skowronek, et al. A Demonstrator for Automatic Music Mood Estimation, 2007, ISMIR.

[23] Emery Schubert, et al. The Perception of Emotion in Music, 2012.

[24] O. Meyers. A mood-based music classification and exploration system, 2007.

[25] Densil Cabrera, et al. 'Psysound3': Software for Acoustical and Psychoacoustical Analysis of Sound Recordings, 2007.

[26] Giovanni De Poli, et al. Score-Independent Audio Features for Description of Music Expression, 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[27] Gert R. G. Lanckriet, et al. Semantic Annotation and Retrieval of Music and Sound Effects, 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[28] Zhihong Zeng, et al. A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions, 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29] Jens Grivolla, et al. Multimodal Music Mood Classification Using Audio and Lyrics, 2008, Seventh International Conference on Machine Learning and Applications.

[30] Lie Lu, et al. Automatic mood detection and tracking of music audio signals, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[31] Wolfgang Nejdl, et al. How do you feel about "dancing queen"?: Deriving mood & theme annotations from user tags, 2009, JCDL '09.

[32] Peter Knees, et al. A music search engine built upon audio-based and web-based similarity measures, 2007, SIGIR.

[33] Gary S. Katz, et al. Bimodal expression of emotion by face and voice, 1998, MULTIMEDIA '98.

[34] K. MacDorman, et al. Automatic Emotion Prediction of Song Excerpts: Index Construction, Algorithm Design, and Empirical Comparison, 2007.

[35] P. Laukka, et al. Expression, Perception, and Induction of Musical Emotions: A Review and a Questionnaire Study of Everyday Listening, 2004.

[36] Andreas F. Ehmann, et al. Lyric Text Mining in Music Mood Classification, 2009, ISMIR.

[37] Luis von Ahn. Games with a Purpose, 2006, Computer.

[38] Durga L. Shrestha, et al. Experiments with AdaBoost.RT, an Improved Boosting Scheme for Regression, 2006, Neural Computation.

[39] J. Sloboda, et al. Music and emotion: Theory and research, 2001.

[40] K. Hevner. Experimental studies of the elements of expression in music, 1936.

[41] Beth Logan, et al. Semantic analysis of song lyrics, 2004, IEEE International Conference on Multimedia and Expo (ICME).

[42] Brandon G. Morton, et al. Improving music emotion labeling using human computation, 2010, HCOMP '10.

[43] J. Russell, et al. An approach to environmental psychology, 1974.

[44] K. Scherer, et al. Emotions evoked by the sound of music: Characterization, classification, and measurement, 2008, Emotion.

[45] Yajie Hu, et al. Lyric-based Song Emotion Detection with Affective Lexicon and Fuzzy Clustering Method, 2009, ISMIR.

[46] J. Marozeau, et al. Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts, 2005.

[47] Mert Bay, et al. The Music Information Retrieval Evaluation eXchange: Some Observations and Insights, 2010, Advances in Music Information Retrieval.

[48] J. Stephen Downie, et al. The music information retrieval evaluation exchange (2005-2007): A window into music information retrieval research, 2008.

[49] Bernhard Schölkopf, et al. A tutorial on support vector regression, 2004, Stat. Comput.

[50] D. Watson, et al. On the Dimensional and Hierarchical Structure of Affect, 1999.

[51] Gert R. G. Lanckriet, et al. User-centered design of a social game to tag music, 2009, HCOMP '09.

[52] Peter Knees, et al. Artist Classification with Web-Based Data, 2004, ISMIR.

[53] Paris Smaragdis, et al. Combining Musical and Cultural Features for Intelligent Style Detection, 2002, ISMIR.

[54] Peter Knees, et al. A Document-Centered Approach to a Natural Language Music Search Engine, 2008, ECIR.

[55] Petri Toiviainen, et al. Prediction of Multidimensional Emotional Ratings in Music from Audio Using Multivariate Regression Models, 2009, ISMIR.

[56] Rainer Reisenzein, et al. Experiencing activation: Energetic arousal and tense arousal are not mixtures of valence and activation, 2002, Emotion.

[57] Andreas Rauber, et al. Integration of Text and Audio Features for Genre Classification in Music Information Retrieval, 2007, ECIR.

[58] M. Bradley, et al. Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings, 1999.

[59] Benoit Huet, et al. Bimodal Emotion Recognition, 2010, ICSR.

[60] Daniel P. W. Ellis, et al. A Web-Based Game for Collecting Music Metadata, 2008, Journal of New Music Research.

[61] Emery Schubert. Update of the Hevner Adjective Checklist, 2003, Perceptual and Motor Skills.

[62] C. C. Pratt. Music as a Language of Emotion, 1948.

[63] Stefanie Nowak, et al. Content-based mood classification for photos and music: A generic multi-modal classification framework and evaluation approach, 2008, MIR '08.

[64] Roger B. Dannenberg, et al. TagATune: A Game for Music and Sound Annotation, 2007, ISMIR.

[65] Mert Bay, et al. The 2007 MIREX Audio Mood Classification Task: Lessons Learned, 2008, ISMIR.

[66] Douglas Turnbull, et al. Exploring "Artist Image" Using Content-Based Analysis of Promotional Photos, 2010, ICMC.

[67] Tao Li, et al. Detecting emotion in music, 2003, ISMIR.

[68] G. Peeters. A Generic Training and Classification System for MIREX08 Classification Tasks: Audio Music Mood, Audio Genre, Audio Artist and Audio Tag, 2008.

[69] Gert R. G. Lanckriet, et al. A Game-Based Approach for Collecting Semantic Annotations of Music, 2007, ISMIR.

[70] Òscar Celma, et al. Search Sounds: An audio crawler focused on weblogs, 2006, ISMIR.

[71] Seungmin Rho, et al. SMERS: Music Emotion Recognition Using Support Vector Regression, 2009, ISMIR.