Learning to Detect Local Features Using Information Change

In this paper we propose that features extracted from deep convolutional neural networks have the structure and information necessary to detect the location and scale of local keypoints. Unlike previous supervised and unsupervised methods, we define a local feature as the outcome of information change across different receptive fields around an image region. Exploiting the representation hierarchy already present in a deep convolutional neural network, we propose a trainable information accumulation pyramid that relates the change in receptive field to the change in information. The network is trained in an unsupervised fashion by applying a random set of transformations to the images and minimizing the covariant loss. We demonstrate the efficacy of the proposed keypoint extractor by evaluating its repeatability and matching scores. Our approach improves on state-of-the-art algorithms by 3.7% in repeatability and 5.2% in matching score.
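
To make the repeatability criterion concrete, the following is a minimal sketch of a point-based repeatability score for two images related by a known homography. It is not the evaluation protocol used in the paper: the function names `warp_points` and `repeatability`, the pixel threshold `eps`, and the mutual-nearest-neighbour rule are all assumptions chosen for illustration. Standard benchmarks typically also restrict the count to keypoints visible in both images and may use region overlap rather than a point distance.

```python
import numpy as np


def warp_points(points, H):
    """Apply a 3x3 homography H to an (N, 2) array of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


def repeatability(kps_a, kps_b, H, eps=3.0):
    """Share of keypoints re-detected after warping image A into image B.

    A keypoint counts as repeated when its warped location and a keypoint
    in B are mutual nearest neighbours and lie within eps pixels of each
    other. The count is normalised by the smaller keypoint set.
    (Illustrative definition; eps and the matching rule are assumptions.)
    """
    kps_b = np.asarray(kps_b, dtype=float)
    if len(kps_a) == 0 or len(kps_b) == 0:
        return 0.0
    warped = warp_points(kps_a, H)
    # Pairwise Euclidean distances, shape (len(kps_a), len(kps_b)).
    dists = np.linalg.norm(warped[:, None, :] - kps_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)  # nearest B point for each warped A point
    nn_ba = dists.argmin(axis=0)  # nearest warped A point for each B point
    idx_a = np.arange(len(warped))
    mutual = nn_ba[nn_ab] == idx_a
    close = dists[idx_a, nn_ab] <= eps
    return float(np.sum(mutual & close)) / min(len(warped), len(kps_b))


# Toy example: image B is image A shifted by (10, 5) pixels.
H_shift = np.array([[1.0, 0.0, 10.0],
                    [0.0, 1.0, 5.0],
                    [0.0, 0.0, 1.0]])
kps_a = [[0, 0], [20, 20], [50, 10]]
kps_b = [[10, 5], [30, 25], [200, 200]]
score = repeatability(kps_a, kps_b, H_shift)
```

In this toy example two of the three keypoints are re-detected at the warped locations, so `score` is 2/3. The mutual-nearest-neighbour rule keeps the matching one-to-one, which prevents the score from exceeding 1.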
