Probabilistic Spatial Distribution Prior Based Attentional Keypoints Matching Network

Keypoint matching is a pivotal component of many image-related applications, such as image stitching and visual simultaneous localization and mapping (SLAM). Both handcrafted and recently emerged deep learning-based keypoint matching methods rely solely on keypoints and their local features, overlooking other sensors available in these applications, such as the inertial measurement unit (IMU). In this paper, we demonstrate that the motion estimate obtained from IMU integration can be used to exploit a spatial distribution prior of keypoints between images. To this end, we propose a probabilistic perspective on the attention formulation, which integrates the spatial distribution prior into the attentional graph neural network naturally. With the assistance of the spatial distribution prior, the network needs less effort to model the hidden features. Furthermore, we present a projection loss for the proposed keypoint matching network, which provides a smooth boundary between matched and unmatched keypoints. Image matching experiments on visual SLAM datasets indicate the effectiveness and efficiency of the presented method.
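The abstract does not spell out how the prior enters the attention weights. One common reading of a "probabilistic perspective" is sketched below in NumPy: the softmax attention weights are treated as a likelihood and multiplied by a prior over candidate keypoints, which is the same as adding the log prior to the attention logits. Everything here is an illustrative assumption rather than the paper's formulation: the function names `gaussian_spatial_prior` and `prior_weighted_attention`, the isotropic Gaussian with width `sigma`, and the idea that `predicted_xy` holds the keypoint locations predicted in the second image from the IMU motion estimate.

```python
import numpy as np


def gaussian_spatial_prior(predicted_xy, candidate_xy, sigma):
    """Row-normalized prior over candidates (hypothetical helper).

    predicted_xy: (N, 2) locations where each query keypoint is expected
                  to appear in the other image (assumed to come from the
                  IMU-based motion estimate).
    candidate_xy: (M, 2) detected keypoint locations in the other image.
    Returns an (N, M) matrix whose rows sum to 1.
    """
    diff = predicted_xy[:, None, :] - candidate_xy[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    log_p = -0.5 * sq_dist / sigma ** 2
    # Subtract the row maximum before exponentiating for numerical stability.
    log_p -= log_p.max(axis=1, keepdims=True)
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def prior_weighted_attention(q, k, v, prior, eps=1e-12):
    """Scaled dot-product attention with a log-prior bias (assumed form).

    q: (N, d) queries, k: (M, d) keys, v: (M, d_v) values,
    prior: (N, M) spatial prior. Returns (output, weights).
    """
    d = q.shape[-1]
    # Adding log(prior) to the logits multiplies the softmax weights by the prior.
    logits = q @ k.T / np.sqrt(d) + np.log(prior + eps)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights
```

Under this assumed form, a query whose descriptor is uninformative (all keys score equally) falls back to the spatial prior alone, while distinctive descriptors can still override a weak prior.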
