Object Instance Sharing by Enhanced Bounding Box Correspondence

Most contemporary object detection approaches assume each object instance in the training data to be uniquely represented by a single bounding box. In this paper, we go beyond this conventional view by allowing an object instance to be described by multiple bounding boxes. The new bounding box annotations are determined based on the alignment of an object instance with the other training instances in the dataset. Our proposal enables the training data to be reused multiple times for training richer multi-component category models. We operationalize this idea by two complementary operations: bounding box shrinking, which finds subregions of an object instance that could be shared; and bounding box enlarging, which enlarges object instances to include local contextual cues. We empirically validate our approach on the PASCAL VOC detection dataset.

[1]  Alexei A. Efros,et al.  An empirical study of context in object detection , 2009, CVPR.

[2]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[3]  Alexei A. Efros,et al.  An empirical study of context in object detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  George Toderici,et al.  Discriminative tag learning on YouTube videos with latent sub-tags , 2011, CVPR 2011.

[5]  Ali Farhadi,et al.  Recognition using visual phrases , 2011, CVPR 2011.

[6]  Vittorio Ferrari,et al.  Exploiting spatial overlap to efficiently compute appearance distances between image windows , 2011, NIPS 2011.

[7]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[8]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[9]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Arnold W. M. Smeulders,et al.  The Visual Extent of an Object , 2011, International Journal of Computer Vision.

[11]  Dean A. Pomerleau,et al.  Neural Network Perception for Mobile Robot Guidance , 1993 .

[12]  Daphna Weinshall,et al.  Exploiting Object Hierarchy: Combining Models from Different Category Levels , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[13]  Gang Wang,et al.  Comparative object similarity for improved recognition with few or no examples , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[14]  Alexei A. Efros,et al.  How Important Are "Deformable Parts" in the Deformable Parts Model? , 2012, ECCV Workshops.

[15]  Charless C. Fowlkes,et al.  Multiresolution Models for Object Detection , 2010, ECCV.

[16]  Ivan Laptev,et al.  Improvements of Object Detection Using Boosted Histograms , 2006, BMVC.

[17]  Daphne Koller,et al.  Discriminative learning of relaxed hierarchy for large-scale visual recognition , 2011, 2011 International Conference on Computer Vision.

[18]  Paul A. Viola,et al.  Multiple Instance Boosting for Object Detection , 2005, NIPS.

[19]  Xiaofeng Ren,et al.  Discriminative Mixture-of-Templates for Viewpoint Classification , 2010, ECCV.

[20]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[21]  Tsuhan Chen,et al.  Extracting adaptive contextual cues from unlabeled regions , 2011, 2011 International Conference on Computer Vision.

[22]  Dariu Gavrila,et al.  Virtual sample generation for template-based shape matching , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[23]  Antonio Torralba,et al.  Semantic Label Sharing for Learning with Many Categories , 2010, ECCV.

[24]  David A. McAllester,et al.  Object Detection with Grammar Models , 2011, NIPS.

[25]  Lior Wolf,et al.  A Critical View of Context , 2006, International Journal of Computer Vision.

[26]  Antonio Torralba,et al.  Transfer Learning by Borrowing Examples for Multiclass Object Detection , 2011, NIPS.

[27]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Joshua B. Tenenbaum,et al.  Learning to share visual appearance for multiclass object detection , 2011, CVPR 2011.

[29]  Andrew Zisserman,et al.  An Exemplar Model for Learning Object Classes , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.