Visual Navigation Using Heterogeneous Landmarks and Unsupervised Geometric Constraints

We present a heterogeneous landmark-based visual navigation approach for a monocular mobile robot. The approach uses heterogeneous visual features (points, line segments, lines, planes, and vanishing points) together with their inner geometric constraints, all managed by a novel multilayer feature graph (MFG). Our method extends the local bundle adjustment-based visual simultaneous localization and mapping (SLAM) framework by explicitly exploiting the heterogeneous features and their inner geometric relationships in an unsupervised manner. As a result, our algorithm takes a video stream as input, initializes and iteratively updates the MFG from extracted key frames, and refines both the robot localization and the MFG landmarks throughout the process. We present pseudocode for the algorithm and analyze its complexity. We have evaluated our method and compared it with state-of-the-art point landmark-based visual SLAM methods on multiple indoor and outdoor datasets. In particular, on the KITTI dataset, our method reduces the translational error by 52.5% on urban sequences where rectilinear structures dominate the scene.
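
To make the idea of a graph that holds several feature types and the relationships between them more concrete, the following is a minimal Python sketch of such a container. It is not the authors' implementation: the class name `MultilayerFeatureGraph`, the relation labels `"parallel"` and `"coplanar"`, and the fixed-step key-frame rule in `select_key_frames` are all assumptions made for illustration, and no feature extraction or bundle adjustment is performed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Layer names follow the feature types listed in the abstract.
LAYERS = ("point", "line_segment", "line", "plane", "vanishing_point")


@dataclass
class MultilayerFeatureGraph:
    """Toy container (hypothetical): one node set per layer plus labeled edges."""
    nodes: Dict[str, Set[str]] = field(
        default_factory=lambda: {name: set() for name in LAYERS}
    )
    # Each edge is (child_id, parent_id, relation), e.g. a line tied to a
    # vanishing point. The relation labels are illustrative assumptions.
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, layer: str, node_id: str) -> None:
        if layer not in self.nodes:
            raise ValueError(f"unknown layer: {layer}")
        self.nodes[layer].add(node_id)

    def add_constraint(self, child: str, parent: str, relation: str) -> None:
        self.edges.append((child, parent, relation))

    def children(self, parent: str, relation: str) -> List[str]:
        """Return the sorted ids of nodes linked to `parent` by `relation`."""
        return sorted(c for c, p, r in self.edges if p == parent and r == relation)


def select_key_frames(num_frames: int, step: int) -> List[int]:
    """Placeholder key-frame rule (an assumption): keep every `step`-th frame."""
    return list(range(0, num_frames, step))


if __name__ == "__main__":
    mfg = MultilayerFeatureGraph()
    mfg.add_node("vanishing_point", "vp1")
    mfg.add_node("plane", "pl1")
    for line_id in ("l1", "l2"):
        mfg.add_node("line", line_id)
        mfg.add_constraint(line_id, "vp1", "parallel")
    mfg.add_node("point", "p1")
    mfg.add_constraint("p1", "pl1", "coplanar")
    print(mfg.children("vp1", "parallel"))
    print(select_key_frames(12, 5))
```

In a full pipeline, each key frame would add nodes and constraints of this kind, and the constraints would then enter the local bundle adjustment as additional terms; that optimization step is deliberately left out of the sketch.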
