Hierarchical learning with backtracking algorithm based on the Visual Confusion Label Tree for large-scale image classification

This paper proposes a hierarchical learning algorithm with backtracking, based on a Bayesian Neural Network classifier, to support large-scale image classification. A Visual Confusion Label Tree is established to organize the large number of categories in an image dataset into a hierarchical structure and to determine the hierarchical learning tasks automatically. Specifically, the Visual Confusion Label Tree is built from the outputs of convolutional neural network models. A parent node on the tree contains a set of sibling coarse-grained categories, and its child nodes contain sets of fine-grained categories that partition the categories on the parent node. The proposed Hierarchical Bayesian Neural Network with backtracking benefits from this hierarchical structure: focusing on the confusion subsets instead of the entire set of categories strengthens the classification ability of the tree classifier. The backtracking algorithm uses the uncertainty information captured by the Bayesian Neural Network to perform a second classification that corrects samples misclassified in the previous classification pass. Experiments on four large-scale datasets show that our tree classifier obtains a significant improvement over the state-of-the-art tree classifier, demonstrating the discriminative hierarchical structure of our Visual Confusion Label Tree and the effectiveness of our Hierarchical Bayesian Neural Network with backtracking.
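The backtracking idea can be illustrated with a small sketch. The abstract does not specify the uncertainty measure, the decision rule, or how the second classification is carried out, so everything below is an assumption made for illustration: uncertainty is taken to be the predictive entropy of averaged stochastic outputs, an uncertain node re-examines its runner-up child, and the final label is the leaf with the highest path probability. The names `Node`, `sampler`, `classify`, and `threshold` are hypothetical, and the hard-coded samplers stand in for trained Bayesian Neural Network node classifiers.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One node of a label tree; a node without children is a single category."""
    label: str
    children: List["Node"] = field(default_factory=list)
    # Hypothetical stand-in for a Bayesian NN: for one input it returns several
    # sampled probability vectors over this node's children.
    sampler: Optional[Callable[[object], List[List[float]]]] = None


def mean_probs(samples: List[List[float]]) -> List[float]:
    """Average the sampled probability vectors element-wise."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def entropy(probs: List[float]) -> float:
    """Predictive entropy, used here as the (assumed) uncertainty measure."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def classify(node: Node, x: object, threshold: float,
             path_prob: float = 1.0) -> Tuple[str, float]:
    """Return (leaf label, path probability) for input x."""
    if not node.children:
        return node.label, path_prob
    probs = mean_probs(node.sampler(x))
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Confident decision: follow only the best child.
    # Uncertain decision: also revisit the runner-up child (the backtracking step).
    keep = ranked[:1] if entropy(probs) <= threshold else ranked[:2]
    candidates = [
        classify(node.children[i], x, threshold, path_prob * probs[i])
        for i in keep
    ]
    return max(candidates, key=lambda c: c[1])


# Toy tree with two confusion subsets; the samplers ignore x and return fixed values.
animals = Node("animals", [Node("cat"), Node("dog")],
               sampler=lambda x: [[0.6, 0.4]])
vehicles = Node("vehicles", [Node("car"), Node("truck")],
                sampler=lambda x: [[0.9, 0.1]])
root = Node("root", [animals, vehicles],
            sampler=lambda x: [[0.55, 0.45], [0.45, 0.55], [0.56, 0.44]])

# A very large threshold disables backtracking (purely greedy descent).
greedy = classify(root, x=None, threshold=10.0)
backtracked = classify(root, x=None, threshold=0.5)
print(greedy[0], backtracked[0])
```

In this toy example the root decision is nearly a coin flip, so greedy descent commits to the "animals" subset, while the backtracking variant also explores "vehicles" and settles on the leaf with the higher overall path probability. The threshold of 0.5 is arbitrary and would need tuning in any real setting.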
