Protocol and baseline for experiments on Bogazici University Turkish emotional speech corpus

This study presents an emotional speech corpus collected at the Department of Electrical and Electronics Engineering, Boğaziçi University, on which no previous signal-processing or machine-learning study has been conducted for classification purposes, and defines a protocol for further experiments on it. The corpus consists of 484 speech utterances from 11 amateur actors, each acting 11 emotionally neutral sentences in 4 emotions using the Stanislavski method. In line with the state of the art in the field, functionals were applied to the low-level descriptors (LLDs) of the signal to obtain fixed-length feature vectors. For this purpose, the openSMILE feature extractor was used with the baseline feature set of the INTERSPEECH 2013 Computational Paralinguistics Challenge. Training, validation, and test partitions are defined on the corpus, and baseline results obtained with Support Vector Machines and Random Forests are reported.
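The functionals-over-LLDs step above can be sketched as follows. This is a minimal illustration, not the actual IS13 ComParE configuration: the real baseline set applies a much larger bank of functionals (regression coefficients, quartiles, moments, etc.) to dozens of LLDs and their deltas, yielding 6,373 features per utterance. The frame counts and the choice of five functionals below are hypothetical.

```python
import numpy as np

def functionals(lld_frames):
    """Collapse a variable-length (n_frames, n_llds) LLD matrix into a
    fixed-length feature vector by applying statistical functionals to
    each descriptor over time. Illustrative subset of functionals only."""
    stats = [
        np.mean,
        np.std,
        np.min,
        np.max,
        lambda a, axis: np.percentile(a, 50, axis=axis),  # median
    ]
    # Each functional maps (n_frames, n_llds) -> (n_llds,); concatenating
    # gives a vector of size n_llds * len(stats), independent of n_frames.
    return np.concatenate([f(lld_frames, axis=0) for f in stats])

# Two utterances of different durations map to vectors of the same size,
# which is what makes fixed-length classifiers (SVM, Random Forest) applicable.
utt_a = np.random.rand(120, 3)  # 120 frames, 3 hypothetical LLDs
utt_b = np.random.rand(300, 3)  # 300 frames, same 3 LLDs
assert functionals(utt_a).shape == functionals(utt_b).shape == (15,)
```

The key property is that the output dimensionality depends only on the number of LLDs and functionals, never on utterance length, so utterances of any duration become comparable feature vectors.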
