Predicting Perceived Music Emotions with Respect to Instrument Combinations

Music Emotion Recognition has attracted considerable research attention in recent years because of its wide range of applications, including song recommendation and music visualization. Because music is a medium through which humans express emotion, there is a need for machines to infer the perceived emotion of a piece of music automatically. In this paper, we compare the accuracy of music emotion recognition models given whole music pieces against the same models given music pieces separated by instrument. To compare the models' emotion predictions, which are distributions over valence and arousal values, we propose a metric that compares two distribution curves. Using this metric, we provide empirical evidence that training Random Forest and Convolutional Recurrent Neural Network models on mixed instrumental music yields a better understanding of emotion than training the same models on music that has been separated into its individual instrumental sources.
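The abstract does not specify the metric used to compare two distribution curves. As a minimal sketch of what such a comparison could look like, the Python below assumes each prediction is a univariate Gaussian over valence (or arousal), described by a mean and a standard deviation, and scores the gap with the closed-form Kullback-Leibler divergence. The Gaussian assumption, the choice of KL divergence, and the names gaussian_kl and symmetric_kl are illustrative assumptions, not the paper's stated method.

```python
import math


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(P || Q) for two univariate Gaussians.

    Assumption: each emotion prediction is summarized as a Gaussian
    (mean, standard deviation) over valence or arousal.
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def symmetric_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Average of KL(P || Q) and KL(Q || P), so argument order does not matter."""
    forward = gaussian_kl(mu_p, sigma_p, mu_q, sigma_q)
    backward = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
    return 0.5 * (forward + backward)


if __name__ == "__main__":
    # Hypothetical values: an annotated valence distribution and a model's prediction.
    annotated = (0.60, 0.10)
    predicted = (0.55, 0.15)
    print(symmetric_kl(*annotated, *predicted))
```

A lower score means the predicted curve sits closer to the annotated one. The symmetric variant is used here so that neither distribution is privileged as the reference, which is a design choice of this sketch.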
