Toward Object-based Place Recognition in Dense RGB-D Maps

Long-term localization and mapping requires the ability to detect when places are being revisited in order to “close loops” and mitigate odometry drift. Appearance-based approaches solve this problem by using visual descriptors to associate camera imagery. These methods have proven remarkably successful, yet their performance degrades under drastic changes in viewpoint or illumination. In this paper, we propose to leverage recent results in dense RGB-D mapping to perform place recognition in the space of objects. We detect objects in the dense 3-D data using a novel feature descriptor generated from primitive kernels. These objects are then connected in a sparse graph that can be quickly searched for place matches. The developed algorithm allows for multi-floor or multi-session building-scale dense mapping and is invariant to viewpoint and illumination. We validate the approach on a number of real datasets collected with a handheld RGB-D camera.
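To illustrate why matching in the space of objects can be insensitive to viewpoint, the following is a minimal sketch and not the algorithm developed in the paper. It assumes objects have already been detected and reduced to a class label and a 3-D centroid, connects nearby objects with edges, and scores a candidate place by how many labeled edge lengths agree. The function names `build_object_graph` and `match_score`, the `max_edge_len` radius, and the `tol` tolerance are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist


def build_object_graph(objects, max_edge_len=3.0):
    """Connect nearby objects into a sparse graph.

    objects: list of (label, (x, y, z)) tuples (assumed detector output).
    Returns a list of edges (label_a, label_b, length) with labels sorted,
    keeping only pairs closer than max_edge_len (hypothetical radius, metres).
    """
    edges = []
    for (la, pa), (lb, pb) in combinations(objects, 2):
        d = dist(pa, pb)
        if d <= max_edge_len:
            a, b = sorted((la, lb))
            edges.append((a, b, d))
    return edges


def match_score(query_edges, map_edges, tol=0.1):
    """Fraction of query edges that find an unused map edge with the same
    label pair and a length within tol. Edge lengths do not change under
    rigid motion, so the score does not depend on the camera viewpoint."""
    if not query_edges:
        return 0.0
    used = set()
    hits = 0
    for qa, qb, qd in query_edges:
        for i, (ma, mb, md) in enumerate(map_edges):
            if i not in used and (qa, qb) == (ma, mb) and abs(qd - md) <= tol:
                used.add(i)
                hits += 1
                break
    return hits / len(query_edges)
```

Because the comparison uses only labels and inter-object distances, a revisit of the same place from a rotated and translated viewpoint yields the same edge set, whereas a place with the same object classes in a different layout does not. A practical system would also need to tolerate missed or spurious detections, which this sketch does not attempt.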
