Learnable Line Segment Descriptor for Visual SLAM

Traditionally, indirect visual motion estimation and simultaneous localization and mapping (SLAM) systems have been based on point features. In recent years, several SLAM systems that use lines as primitives have been proposed. Despite the extra robustness and accuracy brought by line segment matching, the line segment descriptors used in such systems are hand-crafted and therefore sub-optimal. In this paper, we propose applying descriptor learning to construct line segment descriptors optimized for matching tasks. We show how such descriptors can be constructed on top of a deep yet lightweight fully-convolutional neural network. The coefficients of this network are trained on an automatically collected dataset of matching and non-matching line segments. Because the network is fully convolutional, the bulk of the computation needed to compute descriptors is shared among the line segments in the same image, enabling an efficient implementation. We show that the learned line segment descriptors outperform previously proposed hand-crafted line segment descriptors both in isolation (i.e., on the subtask of distinguishing matching from non-matching line segments) and when built into a SLAM system. We construct a new line-based SLAM pipeline on top of a state-of-the-art point-only system. We demonstrate that the learned parameters of the descriptor network generalize between two well-known datasets, one for autonomous driving and one for indoor micro aerial vehicle navigation.
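
The computation sharing described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes a dense feature map `feat` of shape (H, W, C) has already been produced by one pass of some fully-convolutional network, and it forms each descriptor by sampling that map at points along the segment, averaging, and L2-normalizing. The pooling scheme, the function names, `n_samples`, and the triplet margin loss used to contrast matching and non-matching segments are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def bilinear_sample(feat, xs, ys):
    """Bilinearly interpolate feat (H, W, C) at float pixel coordinates xs, ys of shape (N,)."""
    h, w, _ = feat.shape
    xs = np.clip(xs, 0.0, w - 1.0)
    ys = np.clip(ys, 0.0, h - 1.0)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (xs - x0)[:, None]
    wy = (ys - y0)[:, None]
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom


def segment_descriptors(feat, segments, n_samples=16):
    """Pool one shared feature map into a unit-norm descriptor per segment.

    feat:     (H, W, C) dense features, computed once per image (assumed given).
    segments: (S, 4) rows of x1, y1, x2, y2 in pixel coordinates.
    Returns:  (S, C) array of L2-normalized descriptors.
    """
    segments = np.asarray(segments, dtype=float)
    t = np.linspace(0.0, 1.0, n_samples)
    # Evenly spaced sample points along every segment, shape (S, n_samples).
    xs = segments[:, [0]] * (1 - t) + segments[:, [2]] * t
    ys = segments[:, [1]] * (1 - t) + segments[:, [3]] * t
    sampled = bilinear_sample(feat, xs.ravel(), ys.ravel())
    # Average pooling along each segment (an assumed aggregation choice).
    pooled = sampled.reshape(len(segments), n_samples, -1).mean(axis=1)
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.maximum(norms, 1e-12)


def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Assumed training objective: pull matching pairs together, push non-matching apart."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((48, 64, 8))   # stand-in for a network output
    segs = [[2.0, 3.0, 40.5, 20.0], [10.0, 45.0, 60.0, 5.5]]
    desc = segment_descriptors(feat, segs)
    print(desc.shape)
```

In this sketch the per-segment cost is only the sampling and pooling step; the feature map is computed once per image regardless of how many segments are detected, which is the property the abstract attributes to the fully-convolutional design.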
