Joint Hierarchical Category Structure Learning and Large-Scale Image Classification

We investigate scalable image classification over a large number of categories. Hierarchical visual data structures help improve both the efficiency and the performance of large-scale multi-class classification. We propose a novel image classification method based on learning hierarchical inter-class structures. Specifically, we first design a fast algorithm to compute the similarity metric between categories, from which a visual tree is constructed by hierarchical spectral clustering. Using the learned visual tree, the label of a test sample is predicted efficiently by searching for the best path over the entire tree. The proposed method is extensively evaluated on the ILSVRC2010 and Caltech-256 benchmark datasets. The experimental results show that our method obtains significantly better category hierarchies than other state-of-the-art visual tree-based methods and, as a result, much more accurate classification.
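The three stages named in the abstract (inter-category similarity, hierarchical spectral clustering into a visual tree, and best-path label prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity metric or the node classifiers, so the Gaussian affinity between class means, the median split on the Fiedler vector, and the nearest-centroid node scores are all assumptions, and every function name (`class_affinity`, `spectral_bipartition`, `build_tree`, `best_path_predict`, `demo`) is hypothetical.

```python
import numpy as np

def class_affinity(X, y, n_classes):
    """Gaussian affinity between class-mean features (assumed stand-in metric)."""
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    d2 = ((means[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    sigma2 = np.median(d2[d2 > 0])  # bandwidth heuristic, an assumption
    return np.exp(-d2 / sigma2), means

def spectral_bipartition(A):
    """Split nodes in two via the Fiedler vector of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    mask = fiedler > np.median(fiedler)   # median split keeps the tree balanced
    if mask.all() or not mask.any():      # degenerate case: split by index
        mask = np.arange(len(A)) < len(A) // 2
    return mask

def build_tree(A, classes):
    """Recursively bipartition the classes into a nested-dict visual tree."""
    if len(classes) == 1:
        return {"classes": classes, "children": []}
    mask = spectral_bipartition(A)
    children = []
    for m in (mask, ~mask):
        sub = [c for c, keep in zip(classes, m) if keep]
        children.append(build_tree(A[m][:, m], sub))
    return {"classes": classes, "children": children}

def best_path_predict(x, tree, means):
    """Return the leaf class on the root-to-leaf path with the best summed score."""
    def node_score(node):
        # Assumed node score: negative squared distance to the node centroid.
        centroid = means[node["classes"]].mean(axis=0)
        return -np.sum((x - centroid) ** 2)

    def search(node):
        score = node_score(node)
        if not node["children"]:
            return score, node["classes"][0]
        best = max((search(ch) for ch in node["children"]), key=lambda t: t[0])
        return score + best[0], best[1]

    return search(tree)[1]

def demo():
    """Toy data: four classes forming two visually similar pairs."""
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
    y = np.repeat(np.arange(4), 20)
    X = centers[y] + 0.05 * rng.standard_normal((80, 2))
    A, means = class_affinity(X, y, 4)
    return build_tree(A, list(range(4))), means
```

In this toy setup, `demo()` groups the two nearby class pairs under separate children of the root, and `best_path_predict` scores every root-to-leaf path rather than committing greedily at each level, so a poor decision near the root does not by itself fix the final label. Because the scores are summed negative distances, the sketch is only meaningful when paths have comparable depth.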
