Ordinal pyramid pooling for rotation invariant object recognition

Local feature descriptors play a fundamental role in many visual tasks, and their rotation invariance is key to many recognition and detection problems. This paper proposes a novel rotation-invariant descriptor built by ordinal pyramid pooling of local Fourier transform features according to their radial gradient orientations. Since both the low-level feature and the pooling strategy are rotation invariant, the resulting descriptor is rotation invariant by construction. Pooling based on the order of gradient orientations is not only invariant to in-plane rotation, but also encodes gradient orientation information and, to some extent, spatial information into the descriptor. Moreover, this information is enhanced by the proposed pyramid pooling structure. Our method is therefore naturally rotation invariant and strongly discriminative. Experimental results on the aerial car dataset demonstrate the effectiveness of our descriptor.
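The pooling idea can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes each sample point in a patch carries a position (x, y) relative to the patch center, a gradient (gx, gy), and an already rotation-invariant feature vector (a placeholder here for the local Fourier transform features). The function names, the equal-size ordinal groups, the sum pooling, and the pyramid levels (1, 2, 4) are all hypothetical choices made for illustration.

```python
import math

def radial_gradient_orientation(x, y, gx, gy):
    """Gradient angle measured relative to the radial direction, in [0, 2*pi).

    Rotating the patch about its center shifts both angles by the same
    amount, so their difference is unchanged.
    """
    return (math.atan2(gy, gx) - math.atan2(y, x)) % (2 * math.pi)

def ordinal_pyramid_pool(samples, levels=(1, 2, 4)):
    """Sum-pool features over ordinal groups of radial gradient orientation.

    samples: list of (x, y, gx, gy, feature) tuples, feature being a list
    of floats. At each pyramid level the ranked samples are split into
    `bins` groups of (nearly) equal size (an assumed grouping rule).
    """
    ranked = sorted(samples, key=lambda s: radial_gradient_orientation(*s[:4]))
    n = len(ranked)
    dim = len(ranked[0][4])
    descriptor = []
    for bins in levels:
        for b in range(bins):
            group = ranked[b * n // bins:(b + 1) * n // bins]
            pooled = [0.0] * dim
            for s in group:
                for i, v in enumerate(s[4]):
                    pooled[i] += v
            descriptor.extend(pooled)
    return descriptor

def rotate_sample(sample, angle):
    """Rotate position and gradient about the patch center; keep the feature."""
    x, y, gx, gy, feat = sample
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, c * gx - s * gy, s * gx + c * gy, feat)

# Toy patch: 8 samples with distinct radial gradient orientations.
samples = []
for i in range(8):
    pos_angle = 0.7 * i            # direction of the sample from the center
    rel_angle = 0.3 + 0.7 * i      # gradient angle relative to that direction
    radius = 1.0 + 0.1 * i
    samples.append((
        radius * math.cos(pos_angle),
        radius * math.sin(pos_angle),
        math.cos(pos_angle + rel_angle),
        math.sin(pos_angle + rel_angle),
        [float(i + 1), float((i + 1) ** 2)],  # placeholder invariant feature
    ))

desc = ordinal_pyramid_pool(samples)
desc_rot = ordinal_pyramid_pool([rotate_sample(s, 0.9) for s in samples])
print(max(abs(a - b) for a, b in zip(desc, desc_rot)))
```

Because the ranking depends only on the radial gradient orientation, rotating the toy patch leaves the group memberships, and hence the pooled descriptor, unchanged up to floating-point error. Finer pyramid levels split the same ranking into more groups, which is one way to read the abstract's claim that the pyramid enhances the encoded orientation information.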
