Learning to Assign Orientations to Feature Points

We show how to train a Convolutional Neural Network to assign a canonical orientation to feature points, given an image patch centered on each feature point. Our method improves feature point matching over the state of the art and can be used in conjunction with any existing rotation-sensitive descriptor. To avoid the tedious and almost impossible task of finding a target orientation to learn, we propose to use Siamese networks, which implicitly find the optimal orientations during training. We also propose a new type of activation function for Neural Networks that generalizes the popular ReLU, maxout, and PReLU activation functions. This novel activation performs better for our task. We validate the effectiveness of our method extensively on four existing datasets, including two non-planar datasets, as well as on our own dataset. We show that we outperform the state of the art without the need to retrain for each dataset.
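To make the claim about a generalizing activation concrete, here is a minimal sketch of one functional form that contains ReLU, maxout, and PReLU as special cases: an alternating-sign sum of maxima over groups of linear responses. The abstract does not spell out the activation, so this form, the function name `signed_max_sum`, and the wrappers below are assumptions for illustration only; the PReLU reduction as written holds for a non-negative slope `a`.

```python
def signed_max_sum(groups):
    """Alternating-sign sum of per-group maxima (illustrative assumption).

    groups: list of S lists, each holding M linear responses.
    Returns sum over s of sign_s * max(groups[s]), where sign_s is +1 for
    the first, third, ... group and -1 for the second, fourth, ... group.
    """
    total = 0.0
    for s, responses in enumerate(groups):
        sign = 1.0 if s % 2 == 0 else -1.0
        total += sign * max(responses)
    return total


def relu(x):
    # One group with responses (x, 0): max(x, 0).
    return signed_max_sum([[x, 0.0]])


def maxout(responses):
    # One group containing all responses: max over them.
    return signed_max_sum([list(responses)])


def prelu(x, a):
    # Two groups: max(x, 0) - max(-a * x, 0), which equals
    # x for x > 0 and a * x for x < 0 when a >= 0.
    return signed_max_sum([[x, 0.0], [-a * x, 0.0]])


if __name__ == "__main__":
    print(relu(-2.0), relu(3.0))
    print(maxout([1.0, 5.0, 2.0]))
    print(prelu(-2.0, 0.25), prelu(3.0, 0.25))
```

In a network, the responses in each group would be learned linear functions of the layer input; the fixed inputs here only show how the special cases fall out of the same expression.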
