Automatic Group Affect Analysis in Images via Visual Attribute and Feature Networks

This paper proposes a pipeline for automatic group-level affect analysis. A deep neural network-based approach is proposed that leverages facial expression information, scene information, and high-level facial visual attribute information. A capsule network-based architecture is used to predict facial expressions. Transfer learning on Inception-V3 is used to extract global image-based features that capture scene information. Another network is trained to infer the facial attributes of the group members, and these attributes are pooled at the group level to train a network for inferring group-level affect. The facial attribute prediction network, although simple, is effective and produces results comparable to state-of-the-art methods. Finally, the predictions from the three channels are integrated. Experiments show the effectiveness of the proposed techniques on three ‘in the wild’ databases: the Group Affect Database, HAPPEI, and the UCLA-Protest database.
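A minimal sketch of the three-channel fusion described above, written with tf.keras. The layer sizes, the number of expression/attribute classes, the average-pooling of per-face predictions, and the late-fusion classifier head are assumptions for illustration; they are not the paper's exact configuration.

```python
# Sketch of a group-affect pipeline: scene features from Inception-V3 transfer
# learning, plus pooled per-face expression and attribute predictions,
# fused by a small classifier head. All sizes are assumed, not from the paper.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_EXPRESSIONS = 7     # assumed number of facial expression classes per face
NUM_ATTRIBUTES = 8      # assumed number of facial attributes per face
NUM_AFFECT_CLASSES = 3  # e.g. negative / neutral / positive group affect

# Channel 1: global scene features via transfer learning on Inception-V3.
scene_input = layers.Input(shape=(299, 299, 3), name="scene_image")
backbone = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg")
backbone.trainable = False  # freeze ImageNet weights; optionally fine-tune later
scene_feat = backbone(scene_input)

# Channels 2 and 3: per-face expression and attribute predictions produced by
# separate face-level networks, zero-padded to a variable number of faces and
# average-pooled over the group (a simple stand-in for group-level pooling).
expr_input = layers.Input(shape=(None, NUM_EXPRESSIONS), name="face_expressions")
attr_input = layers.Input(shape=(None, NUM_ATTRIBUTES), name="face_attributes")
expr_pooled = layers.GlobalAveragePooling1D()(expr_input)
attr_pooled = layers.GlobalAveragePooling1D()(attr_input)

# Late fusion of the three channels followed by the group-affect classifier.
fused = layers.Concatenate()([scene_feat, expr_pooled, attr_pooled])
hidden = layers.Dense(256, activation="relu")(fused)
output = layers.Dense(NUM_AFFECT_CLASSES, activation="softmax",
                      name="group_affect")(hidden)

model = Model(inputs=[scene_input, expr_input, attr_input], outputs=output)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Under these assumptions, each training example supplies one resized group image plus the stacked face-level prediction vectors; the same structure also accommodates model integration by averaging the softmax outputs of separately trained channels instead of concatenating features.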
