Learning Balanced Trees for Large Scale Image Classification

The label tree is a popular approach to large-scale multi-class image classification, where the number of class labels is large, for example several tens of thousands. In the learning stage, class labels are organized into a hierarchical tree in which each node is associated with a subset of class labels and a classifier that determines which branch to follow, and each leaf node is associated with a single class label. In the testing stage, a test example travels from the root of the tree to a leaf node, which reduces test time significantly compared with using multiple binary one-versus-all classifiers. The balance of the learned tree structure is essential to the label tree approach. Previous methods for learning the tree structure use clustering techniques such as k-means or spectral clustering to group easily confused labels into clusters associated with the nodes. However, the resulting tree might not be balanced. We propose a method for learning an effective and balanced tree structure by jointly optimizing the balance constraint and the confusion constraint. Experimental results on datasets such as Caltech-256, SUN-397, and ImageNet-1K show that the proposed approach achieves higher classification accuracy than other state-of-the-art methods.
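The test-time saving described above can be illustrated with a minimal sketch. It does not implement the proposed learning method: the `Node` class, the `build_balanced` helper, and the hand-set one-hot-sum scorers are hypothetical stand-ins for a learned tree and its node classifiers, chosen only to show root-to-leaf traversal and the number of classifier evaluations in a balanced tree.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Node:
    labels: List[int]  # subset of class labels associated with this node
    children: List["Node"] = field(default_factory=list)
    # One hypothetical linear scorer (weight vector) per child branch.
    weights: List[Sequence[float]] = field(default_factory=list)


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(root: Node, x: Sequence[float]) -> Tuple[int, int]:
    """Walk from the root to a leaf; return (label, scorers evaluated)."""
    node, evaluated = root, 0
    while node.children:
        scores = [dot(w, x) for w in node.weights]
        evaluated += len(scores)
        best = max(range(len(scores)), key=scores.__getitem__)
        node = node.children[best]  # follow the highest-scoring branch
    return node.labels[0], evaluated


def build_balanced(labels: List[int], dim: int) -> Node:
    """Toy balanced binary tree: split the label list in half at every node.

    Assumption for illustration only: each branch scorer sums the input
    coordinates of the labels under that branch, so a one-hot input for
    class c is routed to the leaf holding c.
    """
    node = Node(labels=labels)
    if len(labels) > 1:
        mid = len(labels) // 2
        for part in (labels[:mid], labels[mid:]):
            node.children.append(build_balanced(part, dim))
            node.weights.append([1.0 if i in part else 0.0 for i in range(dim)])
    return node


K = 8
tree = build_balanced(list(range(K)), dim=K)
x = [1.0 if i == 5 else 0.0 for i in range(K)]  # toy feature vector for class 5
label, evaluated = predict(tree, x)
print(label, evaluated)
```

With 8 labels, the balanced binary tree evaluates 2 scorers at each of 3 levels, 6 in total, whereas one-versus-all would evaluate all 8. The gap widens as the label count grows, and it shrinks if the tree is unbalanced, since a degenerate chain-shaped tree can require a number of evaluations that grows linearly with the number of labels.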
