Towards measuring similarity between emotional corpora

In this paper we propose feature selection and Principal Component Analysis as tools for analyzing and comparing corpora of emotional speech. To this end, we introduce a faster variant of the Sequential Forward Floating Search algorithm and run extensive tests on a selection of French emotional language resources, which are well suited to giving a first impression of general applicability. We also develop tools for comparing feature sets, so that the results of feature selection can be evaluated and conclusions drawn about the corpora, or about sub-corpora divided by gender.
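For readers unfamiliar with floating search, the following is a minimal sketch of the classic Sequential Forward Floating Search procedure, not the faster variant introduced in the paper, whose details the abstract does not give. The function name `sffs`, the `toy_score` criterion, and the feature names are all hypothetical and chosen only for illustration; in practice the criterion would typically be something like a cross-validated classification score.

```python
from typing import Callable, Dict, FrozenSet, Sequence, Tuple

Subset = FrozenSet[str]


def sffs(features: Sequence[str],
         score: Callable[[Subset], float],
         k: int) -> Tuple[Subset, float]:
    """Classic SFFS: greedy forward steps with conditional backward steps."""
    if not 0 < k <= len(features):
        raise ValueError("k must be between 1 and the number of features")
    selected: Subset = frozenset()
    best: Dict[int, Tuple[Subset, float]] = {}  # best subset seen per size
    while len(selected) < k:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in sorted(features) if f not in selected]
        add = max(candidates, key=lambda f: score(selected | {f}))
        selected = selected | {add}
        value = score(selected)
        size = len(selected)
        if size not in best or value > best[size][1]:
            best[size] = (selected, value)
        else:
            selected = best[size][0]  # fall back to the better known subset
        # Floating step: drop features while that beats the best subset
        # previously recorded for the smaller size.
        while len(selected) > 2:
            drop = max(sorted(selected), key=lambda f: score(selected - {f}))
            reduced = selected - {drop}
            value = score(reduced)
            if value > best[len(reduced)][1]:
                selected = reduced
                best[len(reduced)] = (reduced, value)
            else:
                break
    return best[k]


# Hypothetical criterion: individual merit minus a redundancy penalty
# whenever feature "a" is combined with "b" or "c".
WEIGHTS = {"a": 30, "b": 26, "c": 25, "d": 4}


def toy_score(subset: Subset) -> float:
    total = sum(WEIGHTS[f] for f in subset)
    if "a" in subset:
        total -= 20 * len(subset & {"b", "c"})
    return total


if __name__ == "__main__":
    subset, value = sffs(list(WEIGHTS), toy_score, 3)
    print(sorted(subset), value)
```

In this toy setting the floating step lets the search discard "a" once "b" and "c" are both present, which a purely greedy forward selection cannot do because it never revisits earlier choices.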
