Group-level arousal and valence recognition in static images: Face, body and context

Automatic analysis of affect has become a well-established research area over the last two decades. However, little attention has been paid to analysing the affect expressed by a group of people in a scene or an interaction setting, either as the affect of the individual group members or as the overall affect expressed collectively. In this paper, we (i) introduce a framework for analysing an image that contains multiple people and recognising the arousal and valence expressed at the group level; (ii) present a dataset of images annotated along the arousal and valence dimensions; and (iii) extract and evaluate a multitude of face, body and context features. We conduct a set of experiments to classify the overall affect expressed at the group level along arousal (high, medium, low) and valence (positive, neutral, negative) using a k-Nearest Neighbour classifier, and we integrate the information provided by the face, body and context features using decision-level fusion. Our experimental results show the viability of the proposed framework compared to other in-the-wild recognition works: we obtain 54% and 55% recognition accuracy for the individual arousal and valence dimensions, respectively.
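The classification and fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract names a k-Nearest Neighbour classifier and decision-level fusion but does not state the fusion rule, the distance metric or the value of k, so the averaging (sum) rule, the Euclidean distance, k, and the function names `knn_posteriors` and `fuse_decisions` are all assumptions made for illustration. The toy feature values are synthetic.

```python
import numpy as np

LABELS = ("low", "medium", "high")  # arousal classes named in the abstract


def knn_posteriors(train_x, train_y, test_x, k=3, n_classes=3):
    """Per-class vote fractions from a k-NN classifier (Euclidean distance assumed)."""
    # Pairwise squared distances, shape (n_test, n_train).
    dists = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]   # indices of the k nearest samples
    votes = train_y[nearest]                     # neighbour labels, shape (n_test, k)
    # Fraction of neighbours voting for each class, shape (n_test, n_classes).
    return np.stack([(votes == c).mean(axis=1) for c in range(n_classes)], axis=1)


def fuse_decisions(posteriors, weights=None):
    """Decision-level fusion by weighted averaging of per-modality posteriors.

    The averaging rule is an assumption; the abstract does not name the rule used.
    """
    stacked = np.stack(posteriors, axis=0)       # (n_modalities, n_test, n_classes)
    if weights is None:
        weights = np.ones(len(posteriors))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    fused = np.tensordot(weights, stacked, axes=1)  # (n_test, n_classes)
    return fused.argmax(axis=1), fused


if __name__ == "__main__":
    # Synthetic stand-ins for face, body and context feature vectors.
    rng = np.random.default_rng(0)
    train_y = np.repeat(np.arange(3), 10)
    test_y = np.repeat(np.arange(3), 2)
    per_modality = []
    for dim in (8, 6, 4):  # hypothetical feature dimensionalities
        train_x = rng.normal(size=(30, dim)) + train_y[:, None]
        test_x = rng.normal(size=(6, dim)) + test_y[:, None]
        per_modality.append(knn_posteriors(train_x, train_y, test_x))
    predicted, _ = fuse_decisions(per_modality)
    print([LABELS[i] for i in predicted])
```

Each modality is classified separately and only the class scores are combined, which is what distinguishes decision-level fusion from concatenating the face, body and context features before classification. The same sketch applies to valence by swapping the label names.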
