Theoretical Framework of a Computational Model of Auditory Memory for Music Emotion Recognition

The bag-of-frames (BOF) approach commonly used in music emotion recognition (MER) has several limitations. The semantic gap is widely believed to impose a glass ceiling on the performance of BOF MER systems, yet few alternative proposals address it. In this article, we introduce the theoretical framework of a computational model of auditory memory that incorporates temporal information into MER systems. We argue that the organization of auditory memory places time at the core of the link between musical meaning and musical emotions. Our main goal is to motivate MER researchers to develop an improved class of systems capable of overcoming the limitations of the BOF approach and coping with the inherent complexity of musical emotions.
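
To make the BOF limitation concrete, the sketch below contrasts an order-free, bag-of-frames track summary with a representation that preserves the temporal sequence of frame-level features. This is only an illustrative example under assumed tooling (Python, librosa for MFCC extraction, and hypothetical helper names such as `bof_features`); it is not the article's auditory-memory model.

```python
# Illustrative sketch only: librosa/MFCC and the function names are assumptions,
# not the method described in the article.
import numpy as np
import librosa


def bof_features(path):
    """Bag-of-frames summary: collapse the time axis into global statistics."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)
    # The BOF step: averaging over frames discards all temporal structure.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # shape: (26,)


def temporal_features(path, hop_frames=43):
    """Sequence of short-segment summaries, so temporal order survives."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    # ~43 frames is roughly one second at the default hop length of 512 samples.
    segments = [mfcc[:, i:i + hop_frames].mean(axis=1)
                for i in range(0, mfcc.shape[1], hop_frames)]
    return np.stack(segments)  # shape: (n_segments, 13)
```

The only difference between the two functions is whether the time axis is collapsed; that collapsed temporal information is precisely what an auditory-memory-based MER system would aim to retain and exploit.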
