The More You Know: Using Knowledge Graphs for Image Classification

One characteristic that sets humans apart from modern learning-based computer vision algorithms is the ability to acquire knowledge about the world and use that knowledge to reason about the visual world. Humans can learn about the characteristics of objects and the relationships that occur between them to learn a large variety of visual concepts, often with few examples. This paper investigates the use of structured prior knowledge in the form of knowledge graphs and shows that using this knowledge improves performance on image classification. We build on recent work on end-to-end learning on graphs, introducing the Graph Search Neural Network as a way of efficiently incorporating large knowledge graphs into a vision classification pipeline. We show in a number of experiments that our method outperforms standard neural network baselines for multi-label classification.

[1]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[2]  John D. Lafferty,et al.  Diffusion Kernels on Graphs and Other Discrete Input Spaces , 2002, ICML.

[3]  Hans-Peter Kriegel,et al.  Shortest-path kernels on graphs , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[4]  F. Scarselli,et al.  A new model for learning in graph domains , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[5]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[6]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[7]  Franco Scarselli,et al.  A Comparison between Recursive Neural Networks and Graph Neural Networks , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.

[8]  S. V. N. Vishwanathan,et al.  Graph kernels , 2007 .

[9]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[10]  Alexei A. Efros,et al.  Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships , 2009, NIPS.

[11]  Kurt Mehlhorn,et al.  Efficient graphlet kernels for large graph comparison , 2009, AISTATS.

[12]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Alessio Micheli,et al.  Neural Network for Graphs: A Contextual Constructive Approach , 2009, IEEE Transactions on Neural Networks.

[14]  Estevam R. Hruschka,et al.  Toward an Architecture for Never-Ending Language Learning , 2010, AAAI.

[15]  Tom M. Mitchell,et al.  Random Walk Inference and Learning in A Large Scale Knowledge Base , 2011, EMNLP.

[16]  Kurt Mehlhorn,et al.  Weisfeiler-Lehman Graph Kernels , 2011, J. Mach. Learn. Res..

[17]  Kun Duan,et al.  Discovering localized attributes for fine-grained recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Abhinav Gupta,et al.  Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes , 2012, ECCV.

[19]  Joshua B. Tenenbaum,et al.  One-shot learning by inverting a compositional causal process , 2013, NIPS.

[20]  Xinlei Chen,et al.  NEIL: Extracting Visual Knowledge from Web Data , 2013, 2013 IEEE International Conference on Computer Vision.

[21]  Dragomir Anguelov,et al.  Capturing Long-Tail Distributions of Object Subcategories , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[23]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[24]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Li Fei-Fei,et al.  Reasoning about Object Affordances in a Knowledge Base Representation , 2014, ECCV.

[26]  Pinar Yanardag,et al.  Deep Graph Kernels , 2015, KDD.

[27]  Li Fei-Fei,et al.  Building a Large-scale Multimodal Knowledge Base System for Answering Visual Queries , 2015 .

[28]  Luc De Raedt,et al.  Graph Invariant Kernels , 2015, IJCAI.

[29]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Ali Farhadi,et al.  VisKE: Visual knowledge extraction and question answering by visual verification of relation phrases , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[32]  Michael S. Bernstein,et al.  Image retrieval using scene graphs , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[34]  Alan L. Yuille,et al.  One Shot Learning via Compositions of Meaningful Patches , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[35]  Joan Bruna,et al.  Deep Convolutional Networks on Graph-Structured Data , 2015, ArXiv.

[36]  John Miller,et al.  Traversing Knowledge Graphs in Vector Space , 2015, EMNLP.

[37]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[38]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[39]  Richard S. Zemel,et al.  Gated Graph Sequence Neural Networks , 2015, ICLR.

[40]  Martial Hebert,et al.  Learning to Learn: Model Regression Networks for Easy Small Sample Learning , 2016, ECCV.

[41]  Ross B. Girshick,et al.  Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[43]  Michael S. Bernstein,et al.  Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.