Learning Tree-structured Quantizers for Image Categorization

Current state-of-the-art image categorization systems rely on bag-of-words representations, which model image content as a histogram of quantization indices that code local image appearance. In this context, randomized tree-structured quantizers have been shown to be computationally efficient and to yield discriminative visual words for a given categorization task. This paper presents a new algorithm that builds tree-structured quantizers not to optimize patch classification, as is done by approaches such as [18], but to directly optimize image classification performance. The approach is experimentally validated on several challenging data sets, on which it outperforms other patch quantizers such as standard decision trees and k-means.
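
As an illustration of the representation described above, the following minimal sketch shows how a tree-structured quantizer can map local descriptors to visual-word indices and accumulate them into a bag-of-words histogram. It is not the learning algorithm of this paper: the tree is hand-built, and the names `Node`, `quantize`, `bag_of_words`, `TREE`, and `PATCHES`, as well as the axis-aligned threshold splits and the toy 2-D descriptors, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    # Internal node: compares descriptor[feature] with threshold (assumed split form).
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaf node: index of the visual word; None marks an internal node.
    word: Optional[int] = None


def quantize(tree: Node, descriptor: Sequence[float]) -> int:
    """Descend from the root to a leaf and return that leaf's visual-word index."""
    node = tree
    while node.word is None:
        go_left = descriptor[node.feature] < node.threshold
        node = node.left if go_left else node.right
    return node.word


def bag_of_words(tree: Node, descriptors: Sequence[Sequence[float]],
                 n_words: int) -> List[float]:
    """Build an L1-normalized histogram of quantization indices for one image."""
    hist = [0.0] * n_words
    for d in descriptors:
        hist[quantize(tree, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hand-built toy tree with three leaves (hypothetical, not a learned quantizer).
TREE = Node(
    feature=0, threshold=0.5,
    left=Node(word=0),
    right=Node(feature=1, threshold=0.5, left=Node(word=1), right=Node(word=2)),
)

# Toy 2-D "patch descriptors" standing in for real local features.
PATCHES = [(0.1, 0.9), (0.7, 0.2), (0.8, 0.8), (0.9, 0.6)]

if __name__ == "__main__":
    print(bag_of_words(TREE, PATCHES, n_words=3))
```

In such a pipeline, the resulting histogram would be the feature vector passed to the image classifier. Quantizing a patch costs one comparison per tree level, which is the source of the computational efficiency mentioned above.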

[1] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[2] David Nistér, et al. Scalable Recognition with a Vocabulary Tree, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[3] Jean Ponce, et al. Learning mid-level features for recognition, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4] Jitendra Malik, et al. Representing and Recognizing the Visual Appearance of Materials using Three-dimensional Textons, 2001, International Journal of Computer Vision.

[5] Guillermo Sapiro, et al. Supervised Dictionary Learning, 2008, NIPS.

[6] Svetlana Lazebnik, et al. Supervised Learning of Quantizer Codebooks by Information Loss Minimization, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Florent Perronnin, et al. Large-scale image categorization with explicit data embedding, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8] Mark Last, et al. Look-Ahead Mechanism Integration in Decision Tree Induction Models, 2006, Advances in Web Intelligence and Data Mining.

[9] Antonio Criminisi, et al. Object categorization by learned universal visual dictionary, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[10] Axel Pinz, et al. Object Localization with Boosting and Weak Supervision for Generic Object Recognition, 2005, SCIA.

[11] Bernt Schiele, et al. Efficient Clustering and Matching for Object Class Recognition, 2006, BMVC.

[12] Thomas G. Dietterich, et al. Learning non-redundant codebooks for classifying complex objects, 2009, ICML '09.

[13] Zhiwei Li, et al. Max-Margin Dictionary Learning for Multiclass Image Categorization, 2010, ECCV.

[14] Frédéric Jurie, et al. Randomized Clustering Forests for Image Classification, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[16] Manik Varma, et al. Learning The Discriminative Power-Invariance Trade-Off, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[17] Cor J. Veenman, et al. Visual Word Ambiguity, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18] Frédéric Jurie, et al. Sampling Strategies for Bag-of-Features Image Classification, 2006, ECCV.

[19] Frédéric Jurie, et al. Modeling spatial layout with fisher vectors for image categorization, 2011, 2011 International Conference on Computer Vision.

[20] Thomas S. Huang, et al. Supervised translation-invariant sparse coding, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21] Subhransu Maji, et al. Max-margin additive classifiers for detection, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22] Stefano Soatto, et al. Localizing Objects with Smart Dictionaries, 2008, ECCV.

[23] Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[24] Pierre Geurts, et al. Extremely randomized trees, 2006, Machine Learning.

[25] Michael Isard, et al. Lost in quantization: Improving particular object retrieval in large scale image databases, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[27] Florent Perronnin, et al. Universal and Adapted Vocabularies for Generic Visual Categorization, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28] J. E. Kelley, et al. The Cutting-Plane Method for Solving Convex Programs, 1960.

[29] Cordelia Schmid, et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[30] Gabriela Csurka, et al. Visual categorization with bags of keypoints, 2002, ECCV 2004.

[31] Frédéric Jurie, et al. Creating efficient codebooks for visual recognition, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[32] Rong Jin, et al. Unifying discriminative visual codebook generation with classifier training for object category recognition, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[33] Thorsten Joachims, et al. A support vector method for multivariate performance measures, 2005, ICML.