Emotion recognition in human-computer interaction using multiple classifier systems

Initial work on automatic emotion recognition concentrated mainly on audio-based emotion classification. Speech is the most important channel for communication between humans, and it may be expected that emotional states are transferred through content, prosody, or paralinguistic cues. Besides the audio modality, with rapidly developing computer hardware and video-processing devices, researchers have started exploring the video modality. Visual emotion recognition work focuses mainly on the extraction and recognition of emotional information from facial expressions. There are also attempts to classify emotional states from body or head gestures and to combine different visual modalities, for instance facial expressions and body gestures captured by two separate cameras [3]. Another line of work is emotion recognition from psycho-physiological measurements such as skin conductance, respiration, the electrocardiogram (ECG), electromyography (EMG), and electroencephalography (EEG). In contrast to speech, gestures, or facial expressions, these biopotentials are governed by the autonomic nervous system and cannot be imitated [4]. Research activities in facial-expression-based and speech-based emotion recognition [6] are usually performed independently of each other. But in almost all practical applications people speak and exhibit facial expressions at the same time, and consequently both modalities should be used in order to perform robust affect recognition. Therefore, multimodal, and in particular audiovisual, emotion recognition has been emerging in recent times [11]; for example, multiple classifier systems have been widely investigated for the classification of human emotions [1, 9, 12, 14]. Combining classifiers is a promising approach to improving overall classification performance [8, 13].
In a multiple classifier system (MCS) it is assumed that the raw data X originates from an underlying source, but each classifier receives a different view of the same raw input data X. The feature vector F_j(X) is used as the input to the j-th classifier, which computes an estimate y_j of the class membership of F_j(X). This output y_j may be a crisp class label or a vector of class memberships, e.g. estimates of posterior probabilities. Based on the multiple classifier outputs y_1, ..., y_N, the combiner produces the final decision y. The combiners used in this study are fixed transformations of the multiple classifier outputs y_1, ..., y_N. Examples of such combining rules are voting, (weighted) averaging, and multiplying, to mention only the most popular types. In addition to a priori fixed combination rules, the combiner can be a …
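The fixed combination rules named above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the three-classifier, three-class example (e.g. class-membership estimates over a few emotion categories) are assumptions for the sketch.

```python
# Fixed combination rules for a multiple classifier system (MCS):
# each classifier j emits a vector y_j of class-membership estimates
# (e.g. posterior probabilities); the combiner is a fixed
# transformation of y_1, ..., y_N followed by an argmax decision.

def average_rule(outputs):
    """Averaging: mean of the N class-membership vectors, per class."""
    n = len(outputs)
    return [sum(y[c] for y in outputs) / n for c in range(len(outputs[0]))]

def product_rule(outputs):
    """Multiplying: per-class product of the N membership estimates."""
    combined = [1.0] * len(outputs[0])
    for y in outputs:
        combined = [p * yc for p, yc in zip(combined, y)]
    return combined

def majority_vote(outputs):
    """Voting: each classifier votes for its argmax class; the class
    with the most votes wins."""
    votes = [max(range(len(y)), key=lambda c: y[c]) for y in outputs]
    return max(sorted(set(votes)), key=votes.count)

def decide(combined):
    """Final crisp decision y: the class with the largest combined score."""
    return max(range(len(combined)), key=lambda c: combined[c])

# Illustrative outputs y_1, y_2, y_3 of three classifiers over three classes:
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
]
print(decide(average_rule(outputs)))  # averaging favors class 0
print(decide(product_rule(outputs)))  # multiplying favors class 1 here
print(majority_vote(outputs))         # two of three argmax votes go to class 0
```

Note that the different rules can disagree, as in this toy example: the product rule penalizes any class that some classifier rates low, while averaging and voting are more forgiving of a single dissenting classifier.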

[1] Friedhelm Schwenker, et al. Multiple Classifier Systems for the Recognition of Human Emotions, 2010, MCS.

[2] Günther Palm, et al. On the discovery of events in EEG data utilizing information fusion, 2013, Comput. Stat.

[3] Hatice Gunes, et al. Bi-modal emotion recognition from expressive face and body gestures, 2007, J. Netw. Comput. Appl.

[4] Lori Lamel, et al. Challenges in real-life emotion annotation and machine learning based detection, 2005, Neural Networks.

[5] Günther Palm, et al. The GMM-SVM Supervector Approach for the Recognition of the Emotional Status from Speech, 2009, ICANN.

[6] K. H. Kim, et al. Emotion recognition system using short-term monitoring of physiological signals, 2004, Medical and Biological Engineering and Computing.

[7] Günther Palm, et al. Spotting laughter in natural multiparty conversations: A comparison of automatic online and offline approaches using audiovisual data, 2012, TIIS.

[8] Markus Kächele, et al. Multiple Classifier Systems for the Classification of Audio-Visual Emotional States, 2011, ACII.

[9] Björn W. Schuller, et al. AVEC 2011 - The First International Audio/Visual Emotion Challenge, 2011, ACII.

[10] Subhash C. Bagui, et al. Combining Pattern Classifiers: Methods and Algorithms, 2005, Technometrics.

[11] Friedhelm Schwenker, et al. A Multiple Classifier System Approach for Facial Expressions in Image Sequences Utilizing GMM Supervectors, 2010, 20th International Conference on Pattern Recognition.

[12] Pierre-Yves Oudeyer, et al. The production and recognition of emotions in speech: features and algorithms, 2003, Int. J. Hum. Comput. Stud.

[13] Friedhelm Schwenker, et al. Multimodal Emotion Classification in Naturalistic User Behavior, 2011, HCI.

[14] G. Palm, et al. Classifier fusion for emotion recognition from speech, 2007.