Paper Information - Two-level Explanations in Music Emotion Recognition

Two-level Explanations in Music Emotion Recognition

Current ML models for music emotion recognition, while generally working quite well, do not give meaningful or intuitive explanations for their predictions. In this work, we propose a 2-step procedure to arrive at spectrogram-level explanations that connect certain aspects of the audio to interpretable mid-level perceptual features, and these to the actual emotion prediction. That makes it possible to focus on specific musical reasons for a prediction (in terms of perceptual features), and to trace these back to patterns in the audio that can be interpreted visually and acoustically.

Gerhard Widmer | Verena Haunschmid | Shreyan Chowdhury
