Adding Discriminative Power to Hierarchical Compositional Models for Object Class Detection

In recent years, hierarchical compositional models have been shown to possess many appealing properties for object class detection, such as coping with a potentially large number of object categories. The reason is that they encode categories by hierarchical vocabularies of parts that are shared among the categories. On the downside, the sharing and the purely reconstructive nature of these models cause problems when distinguishing visually similar categories and separating them from the background. In this paper we propose a novel approach that preserves the appealing properties of generative hierarchical models while improving their discriminative power. We achieve this by introducing a network of discriminative nodes on top of the existing generative hierarchy. The discriminative nodes are sparse linear combinations of activated generative parts. Our experiments show that the discriminative nodes consistently improve a state-of-the-art hierarchical compositional model. The results also show that our approach considers only a fraction of all nodes in the vocabulary (less than 10%), which makes the system computationally efficient.
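To make the idea of a discriminative node as a sparse linear combination of activated generative parts more concrete, the following is a minimal sketch and not the method of the paper. It assumes that a node can be modeled as an L1-regularized logistic regression over part activations, trained by proximal gradient descent; the abstract does not specify the learning procedure. The toy data, the function names, and all parameter values are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def node_response(weights, bias, activations):
    """Linear response of one discriminative node to the part activations."""
    return bias + sum(w * a for w, a in zip(weights, activations))


def train_sparse_node(samples, labels, l1_penalty=0.05, step=0.1, iterations=300):
    """Fit an L1-regularized logistic node (assumed model, not the paper's)."""
    n_parts = len(samples[0])
    n = len(samples)
    weights = [0.0] * n_parts
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_parts
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(node_response(weights, bias, x)) - y
            grad_b += err / n
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / n
        # Gradient step on the logistic loss; the bias is not penalized.
        bias -= step * grad_b
        # Gradient step followed by soft-thresholding, which zeroes small weights.
        weights = [soft_threshold(w - step * g, step * l1_penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def make_toy_data(n_samples=200, n_parts=20, seed=0):
    """Hypothetical activations in [0, 1]; only parts 0 and 1 carry the label."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n_samples):
        x = [rng.random() for _ in range(n_parts)]
        samples.append(x)
        labels.append(1 if x[0] > x[1] else 0)
    return samples, labels


samples, labels = make_toy_data()
weights, bias = train_sparse_node(samples, labels)
# Indices of the generative parts this node actually uses.
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("parts used:", selected, "out of", len(weights))
```

In this sketch the L1 penalty is what produces sparsity: parts whose activations carry little information about the label receive a weight of exactly zero, so the node only needs to read a subset of the vocabulary at detection time. The fraction selected on this toy data says nothing about the figure reported in the abstract.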
