An End-To-End Network for Generating Social Relationship Graphs

Socially intelligent agents are of growing interest in artificial intelligence. Building them requires systems that can understand social relationships in diverse social contexts. Inferring the social context of a visual scene involves not only recognizing objects but also understanding in depth the relationships and attributes of the people involved. One computational approach is to represent human relationships and attributes in an explicit knowledge graph, which allows for high-level reasoning. We introduce a novel end-to-end-trainable neural network that generates a Social Relationship Graph (a structured, unified representation of social relationships and attributes) from a given input image. Our Social Relationship Graph Generation Network (SRG-GN) is the first to use memory cells such as Gated Recurrent Units (GRUs) to iteratively update the social relationship states in a graph using scene and attribute context. The network exploits the recurrent connections among the GRUs to implement message passing between nodes and edges in the graph, and it yields a significant improvement over previous methods for social relationship recognition.
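
The GRU-based message passing between nodes and edges can be illustrated with a minimal sketch. It is not the SRG-GN implementation: the abstract does not say how messages are aggregated, so mean pooling is assumed here, the weights are random placeholders rather than trained parameters, and the names `GRUCell`, `mean`, and `message_passing` are hypothetical. Node states stand in for per-person attribute features and edge states for pairwise relationship features; the CNN feature extractors and scene context are omitted.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

class GRUCell:
    """Bias-free GRU cell with random placeholder weights (an assumption)."""
    def __init__(self, dim, rng):
        def mat():
            return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
        self.Wz, self.Uz = mat(), mat()  # update gate
        self.Wr, self.Ur = mat(), mat()  # reset gate
        self.Wh, self.Uh = mat(), mat()  # candidate state

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        # One common convention: z interpolates from the old state to the candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def message_passing(node_h, edge_h, pairs, node_gru, edge_gru, steps=2):
    """Alternate node-to-edge and edge-to-node messages for a few steps."""
    for _ in range(steps):
        # Each edge receives the mean of its two endpoint node states.
        edge_msg = [mean([node_h[i], node_h[j]]) for i, j in pairs]
        # Each node receives the mean of the states of its incident edges.
        node_msg = []
        for n in range(len(node_h)):
            incident = [edge_h[e] for e, pair in enumerate(pairs) if n in pair]
            node_msg.append(mean(incident) if incident else [0.0] * len(node_h[n]))
        # Both message sets were computed from the old states; now update.
        node_h = [node_gru.step(m, h) for m, h in zip(node_msg, node_h)]
        edge_h = [edge_gru.step(m, h) for m, h in zip(edge_msg, edge_h)]
    return node_h, edge_h

# Toy graph: three people, one edge per pair.
rng = random.Random(0)
dim = 4
pairs = [(0, 1), (0, 2), (1, 2)]
node_h = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
edge_h = [[rng.gauss(0, 1) for _ in pairs] + [0.0] * (dim - len(pairs)) for _ in pairs]
new_nodes, new_edges = message_passing(
    node_h, edge_h, pairs, GRUCell(dim, rng), GRUCell(dim, rng)
)
```

In a full model, the final edge states would feed a relationship classifier and the final node states an attribute classifier; those heads are left out of this sketch.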
