Hierarchical Novelty Detection for Visual Object Recognition

Deep neural networks have achieved impressive success in large-scale visual object recognition tasks with a predefined set of classes. However, recognizing objects of novel classes unseen during training still remains challenging. The problem of detecting such novel classes has been addressed in the literature, but most prior works have focused on providing simple binary or regressive decisions, e.g., the output would be "known," "novel," or corresponding confidence intervals. In this paper, we study more informative novelty detection schemes based on a hierarchical classification framework. For an object of a novel class, we aim for finding its closest super class in the hierarchical taxonomy of known classes. To this end, we propose two different approaches termed top-down and flatten methods, and their combination as well. The essential ingredients of our methods are confidence-calibrated classifiers, data relabeling, and the leave-one-out strategy for modeling novel classes under the hierarchical taxonomy. Furthermore, our method can generate a hierarchical embedding that leads to improved generalized zero-shot learning performance in combination with other commonly-used semantic embeddings.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Victoria J. Hodge,et al.  A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[3]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[4]  Wei-Lun Chao,et al.  Synthesized Classifiers for Zero-Shot Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Andrew Y. Ng,et al.  Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[6]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Samy Bengio,et al.  Large-Scale Object Classification Using Label Relation Graphs , 2014, ECCV.

[8]  Kibok Lee,et al.  Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples , 2017, ICLR.

[9]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[10]  Yarin Gal,et al.  Dropout Inference in Bayesian Neural Networks with Alpha-divergences , 2017, ICML.

[11]  Raúl Santos-Rodríguez,et al.  Hierarchical Novelty Detection , 2017, IDA.

[12]  David A. Clifton,et al.  A review of novelty detection , 2014, Signal Process..

[13]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[14]  Bernt Schiele,et al.  Learning Deep Representations of Fine-Grained Visual Descriptions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[16]  H LampertChristoph,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014 .

[17]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[18]  Bolei Zhou,et al.  Open Vocabulary Scene Parsing , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[19]  Fei-Fei Li,et al.  What Does Classifying More Than 10, 000 Image Categories Tell Us? , 2010, ECCV.

[20]  Terrance E. Boult,et al.  Towards Open Set Deep Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[22]  Kevin Gimpel,et al.  A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks , 2016, ICLR.

[23]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[24]  Robinson Piramuthu,et al.  HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  R. Srikant,et al.  Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks , 2017, ICLR.

[26]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Christoph H. Lampert,et al.  Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Charles Blundell,et al.  Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles , 2016, NIPS.

[29]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Wei-Lun Chao,et al.  An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild , 2016, ECCV.

[31]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[32]  Yanwei Fu,et al.  Semi-supervised Vocabulary-Informed Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Bernt Schiele,et al.  Evaluation of output embeddings for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Bernt Schiele,et al.  Transfer Learning in a Transductive Setting , 2013, NIPS.

[35]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[36]  Tao Xiang,et al.  Learning a Deep Embedding Model for Zero-Shot Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Jonathan Krause,et al.  Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.