Multimodal Music Mood Classification Using Audio and Lyrics

In this paper we present a study on music mood classification using audio and lyrics information. The mood of a song is expressed through musical features, but a relevant part also seems to be conveyed by the lyrics. We evaluate each factor independently and explore the possibility of combining both, using natural language processing and music information retrieval techniques. We show that standard distance-based methods and latent semantic analysis can classify the lyrics significantly better than random, but their performance remains well below that of audio-based techniques. We then introduce a method based on differences between language models that yields performance closer to audio-based classifiers. Moreover, integrating this method into a multimodal (audio + text) system improves overall performance. We demonstrate that lyrics and audio information are complementary and can be combined to improve a classification system.
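The distance-based lyrics baseline mentioned in the abstract can be sketched as a tf-idf nearest-neighbour classifier. The toy corpus, mood labels, and weighting details below are illustrative assumptions, not the paper's actual setup:

```python
# Hypothetical sketch of a distance-based lyrics mood classifier:
# tf-idf weighting plus cosine nearest neighbour. The tiny corpus
# and label set are invented for illustration only.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute tf-idf vectors (as term -> weight dicts) for token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log(1 + n / df[t]) for t in tf})
    return vecs

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(query, docs, labels):
    """Assign the mood label of the most similar training lyric."""
    # The query is vectorised together with the corpus so document
    # frequencies cover its terms as well (a simplification).
    vecs = tfidf_vectors(docs + [query])
    qvec = vecs[-1]
    sims = [cosine(qvec, v) for v in vecs[:-1]]
    return labels[max(range(len(sims)), key=sims.__getitem__)]

lyrics = [
    "sunshine happy dancing smile".split(),
    "tears lonely crying rain".split(),
    "happy party dancing tonight".split(),
    "lonely tears sorrow alone".split(),
]
moods = ["happy", "sad", "happy", "sad"]

print(classify("happy dancing all day".split(), lyrics, moods))  # -> happy
```

In this sketch the query's vocabulary overlap with the "happy" lyrics dominates the cosine similarity, so the nearest neighbour carries the predicted mood; the abstract's point is that such purely lexical baselines beat random but trail audio-based classifiers.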
