Music emotion classification and context-based music recommendation

Context-based music recommendation is a rapidly emerging application in the ubiquitous computing era, and it requires multidisciplinary effort: low-level feature extraction and music classification, description and prediction of human emotion, ontology-based representation and recommendation, and the establishment of connections among them. In this paper, we contribute to context awareness in music recommendation in three distinct ways. First, we propose a novel emotion state transition model (ESTM) that models human emotional states and the transitions among them induced by music. The ESTM acts as a bridge between the user's situational and emotional information on one side and low-level music features on the other, allowing us to recommend the music most appropriate for moving the user to a desired emotional state. Second, we present the Context-based Music Recommendation (COMUS) ontology for modeling the user's musical preferences and context, and for supporting reasoning about the user's desired emotion and preferences. COMUS is a music-dedicated OWL ontology built by extending the Music Ontology with domain-specific classes for music recommendation, including situation, mood, and musical features. Third, to map low-level features to the ESTM, we collected various high-dimensional music feature data and applied nonnegative matrix factorization (NMF) for dimensionality reduction, using a support vector machine (SVM) as the emotional-state-transition classifier. We built a prototype music recommendation system on these components, carried out various experiments to measure its performance, and report some of the experimental results.
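As a rough illustration of the NMF-plus-SVM pipeline sketched above, the following snippet (ours, not the authors' implementation; the feature dimensionality, number of NMF components, and number of transition classes are assumed purely for illustration) reduces a nonnegative feature matrix with scikit-learn's NMF and trains an RBF-kernel SVM on the reduced representation:

    # Minimal sketch: NMF dimensionality reduction followed by SVM
    # classification. Data shapes and hyperparameters are illustrative,
    # not taken from the paper.
    import numpy as np
    from sklearn.decomposition import NMF
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.random((200, 512))        # 200 clips x 512 nonnegative audio features (synthetic)
    y = rng.integers(0, 4, size=200)  # 4 hypothetical emotion-transition classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Factor X ~ W H; the rows of W are the low-dimensional clip representations.
    nmf = NMF(n_components=20, init="nndsvda", max_iter=500, random_state=0)
    W_train = nmf.fit_transform(X_train)
    W_test = nmf.transform(X_test)

    # RBF-kernel SVM as the transition classifier (multiclass via one-vs-one).
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(W_train, y_train)
    print("test accuracy:", clf.score(W_test, y_test))

In the factorization X ≈ WH, keeping the activation matrix W as the reduced feature set mirrors the role NMF plays in the paper's feature pipeline, while the SVM stands in for the emotional-state-transition classifier.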
