Robust localization for planar moving robot in changing environment: A perspective on density of correspondence and depth

Visual localization for planar moving robots is important to many indoor service robotics applications. To handle the textureless areas and frequent human activity found in indoor environments, we propose a novel robust visual localization algorithm for planar moving robots that leverages dense correspondences and sparse depth. The key component is a minimal solution that computes the absolute camera pose from one 3D-2D correspondence and one 2D-2D correspondence. This design has two advantages. First, robustness is enhanced because the sample set for pose estimation is maximal: all correspondences are used, with or without depth. Second, dense correspondences can be exploited to handle textureless and repetitive-texture scenes without the extra effort of constructing a dense map. This matters because building a dense map is computationally expensive, especially at large scale. Moreover, a probabilistic analysis of the different solutions is presented, and an automatic solution selection mechanism is designed to maximize the success rate by choosing the appropriate solution for the characteristics of each environment. Finally, a complete visual localization pipeline that considers situations from the perspective of correspondence density and depth density is summarized and validated both in simulation and on a public real-world indoor localization dataset. The code is released on GitHub.
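To make the link between minimal sample size and success rate concrete, the sketch below uses the textbook RANSAC approximation, not the probabilistic analysis of the paper itself. It assumes each 3D-2D and 2D-2D correspondence is an inlier independently with a given ratio, and the solver names, the three-point baseline, and the inlier ratios are hypothetical values chosen only for illustration.

```python
import math


def all_inlier_probability(ratio_3d2d, ratio_2d2d, n_3d2d, n_2d2d):
    """Probability that one random minimal sample contains only inliers.

    Uses the usual independence approximation: each correspondence drawn
    is an inlier with probability equal to the inlier ratio of its type.
    """
    return (ratio_3d2d ** n_3d2d) * (ratio_2d2d ** n_2d2d)


def ransac_iterations(sample_success, confidence=0.99):
    """Iterations needed to draw at least one all-inlier sample
    with the requested confidence (standard RANSAC bound)."""
    if sample_success <= 0.0:
        return math.inf
    if sample_success >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - sample_success))


def select_solver(solvers, ratio_3d2d, ratio_2d2d):
    """Pick the solver whose minimal sample is most likely to be outlier-free.

    `solvers` maps a (hypothetical) solver name to its minimal sample
    composition: (number of 3D-2D, number of 2D-2D correspondences).
    """
    return max(
        solvers,
        key=lambda name: all_inlier_probability(ratio_3d2d, ratio_2d2d, *solvers[name]),
    )


# Hypothetical solver table; only the first composition is taken from the abstract.
SOLVERS = {
    "1 x 3D-2D + 1 x 2D-2D": (1, 1),
    "2 x 3D-2D": (2, 0),
    "3 x 3D-2D (general P3P-style baseline)": (3, 0),
}

if __name__ == "__main__":
    # Illustrative inlier ratios: sparse depth matches are less reliable here.
    r3, r2 = 0.3, 0.6
    for name, (n3, n2) in SOLVERS.items():
        p = all_inlier_probability(r3, r2, n3, n2)
        print(f"{name}: p = {p:.3f}, iterations = {ransac_iterations(p)}")
    print("selected:", select_solver(SOLVERS, r3, r2))
```

Under these assumed ratios the mixed sample is the most likely to be outlier-free, because it draws only one correspondence from the less reliable 3D-2D set. If the ratios were reversed, the same rule would prefer a solver that relies on 3D-2D correspondences alone, which is the kind of environment-dependent choice a selection mechanism has to make.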
