Role of Group Level Affect to Find the Most Influential Person in Images

Group affect analysis is an important cue for predicting various group traits. In general, the estimated group affect, individual emotional responses, eye gaze, and the positions of people in an image are important cues for identifying an important person in a group. The main focus of this paper is to explore the importance of group affect in finding the representative of a group. We call that person the “Most Influential Person” (in terms of first impression) or “leader” of the group. To identify the main visual cues for the “Most Influential Person”, we conducted a user survey. Based on the survey statistics, we annotate the “influential persons” in 1000 images of the Group AFfect database (GAF 2.0) using the LabelMe toolbox and propose the “GAF-personage database”. To identify the “Most Influential Person”, we propose a DNN-based Multiple Instance Learning (Deep MIL) method that takes deep facial features as input. To leverage the deep facial features, we first predict individual emotion probabilities via CapsNet and rank the detected faces accordingly. We then extract deep facial features of the top-3 faces via a VGG-16 network. Our method performs better than maximum-facial-area and saliency-based importance methods and achieves human-level perception of the “Most Influential Person” at the group level.
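
The selection steps described above (rank faces by emotion probability, keep the top-3, then choose one face within a multiple-instance framework) can be illustrated with a minimal sketch. It is not the authors' implementation: the CapsNet and VGG-16 models are replaced by precomputed numbers, the ranking criterion (peak emotion probability per face) is an assumption because the abstract does not specify it, and max pooling stands in for whatever pooling the Deep MIL network uses. All function names are hypothetical. The maximum-facial-area baseline mentioned above is included for comparison.

```python
from typing import List, Sequence, Tuple


def rank_faces_by_emotion(emotion_probs: Sequence[Sequence[float]], k: int = 3) -> List[int]:
    """Return indices of the k faces with the highest peak emotion probability.

    Assumption: a face is ranked by the largest value in its emotion
    probability vector; the paper's exact criterion is not given here.
    """
    confidence = [max(p) for p in emotion_probs]
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    return order[:k]


def mil_select(instance_scores: Sequence[float], candidates: Sequence[int]) -> Tuple[int, float]:
    """Max-pooling MIL step: the bag (image) score is the best instance (face) score.

    Returns the index of the winning face and its score.
    """
    best = max(candidates, key=lambda i: instance_scores[i])
    return best, instance_scores[best]


def largest_face(face_areas: Sequence[float]) -> int:
    """Baseline: pick the face with the maximum bounding-box area."""
    return max(range(len(face_areas)), key=lambda i: face_areas[i])


# Hypothetical inputs for one image with four detected faces.
emotion_probs = [
    [0.20, 0.50, 0.30],
    [0.10, 0.80, 0.10],
    [0.40, 0.30, 0.30],
    [0.05, 0.05, 0.90],
]
# Stand-in for per-face influence scores a trained network would produce.
instance_scores = [0.70, 0.20, 0.95, 0.40]
face_areas = [900.0, 1600.0, 2500.0, 400.0]

top3 = rank_faces_by_emotion(emotion_probs, k=3)
influential, score = mil_select(instance_scores, top3)
baseline = largest_face(face_areas)
```

In this toy example the emotion-based ranking filters out face 2 before the MIL step, so the two approaches disagree: the sketch selects face 0, while the area baseline selects face 2.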
