Global and efficient self-similarity for object classification and detection

Self-similarity is an attractive image property that has recently found its way into object recognition in the form of local self-similarity descriptors [5, 6, 14, 18, 23, 27]. In this paper we explore global self-similarity (GSS) and its advantages over local self-similarity (LSS). We make three contributions: (a) we propose computationally efficient algorithms to extract GSS descriptors for classification, which capture the spatial arrangement of self-similarities within the entire image; (b) we show how to use these descriptors efficiently for detection in a sliding-window framework and in a branch-and-bound framework; (c) we demonstrate experimentally on Pascal VOC 2007 and on ETHZ Shape Classes that GSS outperforms LSS for both classification and detection, and that GSS descriptors are complementary to conventional descriptors such as gradients or color.
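The abstract does not spell out how a GSS descriptor is built, so the following is only a toy illustration of the general idea of comparing every image region with every other region, not the algorithm proposed in the paper. It assumes a grayscale image given as a list of rows, a regular grid of cells, per-cell intensity histograms, and histogram intersection as the similarity measure; the function names and parameters (`cell_histograms`, `global_self_similarity`, `grid`, `bins`) are hypothetical.

```python
def cell_histograms(image, grid, bins=8, max_value=256):
    """Split a 2-D grayscale image (list of rows of ints) into grid x grid
    cells and return one L1-normalised intensity histogram per cell,
    in row-major cell order. (Illustrative helper, not from the paper.)"""
    height, width = len(image), len(image[0])
    hists = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            hist = [0.0] * bins
            for y in range(y0, y1):
                for x in range(x0, x1):
                    hist[image[y][x] * bins // max_value] += 1.0
            total = sum(hist) or 1.0  # guard against empty cells
            hists.append([h / total for h in hist])
    return hists


def global_self_similarity(image, grid=4, bins=8):
    """Return a (grid*grid) x (grid*grid) matrix whose entry [i][j] is the
    histogram-intersection similarity between cell i and cell j.
    Assumption: histogram intersection stands in for whatever similarity
    the actual method uses."""
    hists = cell_histograms(image, grid, bins)
    n = len(hists)
    return [[sum(min(a, b) for a, b in zip(hists[i], hists[j]))
             for j in range(n)] for i in range(n)]


# Toy image: dark left half, bright right half.
image = [[0, 0, 255, 255] for _ in range(4)]
gss = global_self_similarity(image, grid=2, bins=8)
print(gss[0][2])  # top-left vs bottom-left cell (both dark)  → 1.0
print(gss[0][1])  # top-left vs top-right cell (dark vs bright) → 0.0
```

Unlike a local descriptor, which only compares a patch with its immediate surroundings, the matrix here relates every cell to every other cell in the image, which is the sense of "global" that the sketch is meant to convey.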

[1] N. Ahmed, et al. Discrete Cosine Transform, 1996.

[2] Franklin C. Crow, et al. Summed-area tables for texture mapping, 1984, SIGGRAPH.

[3] O. Barndorff-Nielsen, et al. Approximating exponential models, 1989.

[4] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[5] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[6] Michal Irani, et al. Detecting Irregularities in Images and in Video, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[7] Frédéric Jurie, et al. Creating efficient codebooks for visual recognition, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[8] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9] Jianguo Zhang, et al. The PASCAL Visual Object Classes Challenge, 2006.

[10] Luc Van Gool, et al. Object Detection by Contour Segment Networks, 2006, ECCV.

[11] Frédéric Jurie, et al. Fast Discriminative Visual Codebooks using Randomized Clustering Forests, 2006, NIPS.

[12] Cordelia Schmid, et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[13] Antonio Criminisi, et al. TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, 2007, International Journal of Computer Vision.

[14] Eli Shechtman, et al. Matching Local Self-Similarities across Images and Videos, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[15] Frédéric Jurie, et al. Groups of Adjacent Contour Segments for Object Detection, 2008, IEEE Trans. Pattern Anal. Mach. Intell.

[16] Rainer Lienhart, et al. Deep networks for image retrieval on large-scale databases, 2008, ACM Multimedia.

[17] Patrick Pérez, et al. Cross-View Action Recognition from Temporal Self-similarities, 2008, ECCV.

[18] Eli Shechtman, et al. In defense of Nearest-Neighbor based image classification, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[19] Luc Van Gool, et al. Speeded-Up Robust Features (SURF), 2008, Comput. Vis. Image Underst.

[20] Christoph H. Lampert, et al. Learning to detect unseen object classes by between-class attribute transfer, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[21] Michael Goesele, et al. A shape-based object class model for knowledge transfer, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22] Christoph H. Lampert, et al. Efficient Subwindow Search: A Branch and Bound Framework for Object Localization, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Christopher Hunt, et al. Notes on the OpenSURF Library, 2009.

[24] Andrew Zisserman, et al. Efficient retrieval of deformable shape classes using local self-similarities, 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[25] Sebastian Nowozin, et al. On feature combination for multiclass object classification, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[26] Andrew Zisserman, et al. Multiple kernels for object detection, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[27] Jitendra Malik, et al. Object detection using a max-margin Hough transform, 2009, CVPR.

[28] Luc Van Gool, et al. Feature-centric Efficient Subwindow Search, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[29] David A. McAllester, et al. Object Detection with Discriminatively Trained Part Based Models, 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30] Jianqin Zhou, et al. On discrete cosine transform, 2011, ArXiv.