Evaluation of Audio Feature Groups for the Prediction of Arousal and Valence in Music

Computer-aided prediction of arousal and valence ratings helps to automatically associate emotions with music pieces, enabling new approaches to music categorisation and recommendation as well as the theoretical analysis of listening habits. The impact of several groups of music properties, such as timbre, harmony, melody or rhythm, on perceived emotions has often been studied in the literature. However, little work has been done to measure extensively how much a specific feature group contributes when it supplements combinations of other features already integrated into the regression model. In our experiment, we measure the performance of multiple linear regression applied to combinations of energy, harmony, rhythm and timbre audio features to predict arousal and valence ratings. Each group is represented by a reduced number of dimensions chosen by Minimum Redundancy–Maximum Relevance (MRMR) feature selection. The results show that cepstral timbre features are particularly useful for predicting arousal, and that rhythm features are the most relevant for predicting valence.
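The experimental design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which relevance and redundancy measures, data split, or error measure were used, so the sketch assumes absolute Pearson correlation for both MRMR terms, a simple hold-out split, and mean squared error. All function names, the parameters `k` and `n_train`, and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def mrmr_select(X, y, k):
    """Greedy MRMR: pick k columns of X, trading relevance against redundancy.

    Assumption: relevance and redundancy are both measured by absolute
    Pearson correlation. Columns must not be constant.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) for s in selected]
            )
            score = relevance[j] - redundancy  # difference form of the criterion
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected


def fit_predict_linear(X_train, y_train, X_test):
    """Multiple linear regression with intercept, solved by least squares."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef


def evaluate_group_combinations(groups, y, k=2, n_train=150):
    """Return {tuple of group names: hold-out MSE} for every non-empty combination."""
    # Reduce each group to k dimensions, selecting on the training part only.
    reduced = {}
    for name, X in groups.items():
        idx = mrmr_select(X[:n_train], y[:n_train], k)
        reduced[name] = X[:, idx]
    results = {}
    names = sorted(reduced)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            X = np.hstack([reduced[name] for name in combo])
            pred = fit_predict_linear(X[:n_train], y[:n_train], X[n_train:])
            results[combo] = float(np.mean((pred - y[n_train:]) ** 2))
    return results


# Synthetic stand-in data (hypothetical): four feature groups, 200 "tracks".
rng = np.random.default_rng(0)
n = 200
groups = {
    name: rng.normal(size=(n, 5))
    for name in ("energy", "harmony", "rhythm", "timbre")
}
# Fake arousal rating driven mainly by one timbre feature.
arousal = (
    2.0 * groups["timbre"][:, 0]
    + 0.5 * groups["energy"][:, 1]
    + 0.1 * rng.normal(size=n)
)
results = evaluate_group_combinations(groups, arousal)
```

Comparing the error of a combination with and without a given group (for example `results[("energy", "timbre")]` against `results[("energy",)]`) gives one way to read off how much that group adds to features already in the model.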
