The more the merrier: Analysing the affect of a group of people in images

The growth of social media has given users a platform to engage and interact with a global population. With millions of images being uploaded to social media platforms, there is increasing interest in inferring the emotion and mood displayed by a group of people in images. Automatic affect analysis research has come a long way but has traditionally focussed on a single subject in a scene. In this paper, we study the problem of inferring the emotion of a group of people in an image. Group affect analysis has wide applications in retrieval, advertisement, content recommendation and security. The contributions of the paper are: 1) a novel emotion-labelled database of groups of people in images; 2) a Multiple Kernel Learning based hybrid affect inference model; 3) a scene context based affect inference model; 4) a user survey to better understand the attributes that influence the perceived affect of a group of people in an image. Detailed experimental validation provides a rich baseline for the proposed database.
