Hierarchical Image Classification using Entailment Cone Embeddings

Image classification has been studied extensively, but little work has explored external guidance beyond conventional image-label pairs for training. We present a set of methods for leveraging the semantic hierarchy embedded in class labels. We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier and show empirically that making this external semantic information available alongside the visual semantics of the images improves overall performance. Going a step further, we model label-label and label-image interactions more explicitly using order-preserving embeddings governed by both Euclidean and hyperbolic geometries, which are prevalent in natural language processing, and tailor them to hierarchical image classification and representation learning. We empirically validate all the models on the hierarchical ETHEC dataset.
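
To make the notion of an order-preserving embedding concrete, the following is a minimal sketch of a Euclidean entailment-cone penalty in the style of the hyperbolic entailment cones literature. It is an illustration under stated assumptions, not the implementation used in this work: the function name `cone_energy`, the constant `K`, and the example vectors are all hypothetical. A parent embedding `x` defines a cone whose half-aperture is arcsin(K / ||x||), and a child `y` incurs a penalty only when the angle between `x` and `y - x` exceeds that aperture.

```python
import numpy as np


def cone_energy(x, y, K=0.1, eps=1e-9):
    """Penalty for y lying outside the Euclidean entailment cone rooted at x.

    Assumed formulation (illustrative only):
      half-aperture  psi(x)   = arcsin(K / ||x||)
      angle at x     Xi(x, y) = angle between x and (y - x)
      energy         E(x, y)  = max(0, Xi(x, y) - psi(x))
    K is a hypothetical hyperparameter; it should satisfy K <= ||x||.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    norm_x = np.linalg.norm(x)
    diff = y - x
    norm_diff = np.linalg.norm(diff)

    # Half-aperture of the cone at x; clipping keeps arcsin within its domain.
    aperture = np.arcsin(np.clip(K / max(norm_x, eps), -1.0, 1.0))

    # Angle between the cone axis (direction of x) and the direction x -> y.
    cos_angle = np.dot(x, diff) / max(norm_x * norm_diff, eps)
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))

    # Zero when y is inside the cone, positive otherwise.
    return max(0.0, float(angle - aperture))


# Hypothetical 2-D embeddings: a parent label and two candidate children.
parent = [1.0, 0.0]
inside_child = [2.0, 0.0]   # lies on the cone axis
outside_child = [1.0, 1.0]  # perpendicular to the cone axis

print(cone_energy(parent, inside_child))
print(cone_energy(parent, outside_child))
```

In a training setup of this kind, the energy would typically be driven to zero for true parent-child pairs (label-label or label-image) and pushed above a margin for negative pairs; the hyperbolic variant replaces the Euclidean norms and angles with their counterparts in the Poincaré ball.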
